com.salesforce.op.stages.impl.tuning

DataBalancer

class DataBalancer extends Splitter with DataBalancerParams

Splits the data into a training set and a holdout (test) set, then balances the training set before fitting a binary classification model.
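
As an illustration of how the parameters listed on this page fit together, here is a minimal configuration sketch. It assumes TransmogrifAI and Spark are on the classpath and uses only the constructor, setters, and getters documented below. The wrapper object name `DataBalancerUsage` and the chosen values are hypothetical.

```scala
import com.salesforce.op.stages.impl.tuning.DataBalancer

// Hypothetical wrapper object, used only to make the snippet compilable.
object DataBalancerUsage {
  // Each setter returns this.type, so the calls can be chained.
  val balancer: DataBalancer = new DataBalancer()
    .setSampleFraction(0.2)       // target share of the minority class, in (0.0, 1.0)
    .setMaxTrainingSample(10000)  // cap on the training set size, must be > 0
    .setReserveTestFraction(0.1)  // fraction of the data held out for testing
    .setSeed(42L)                 // seed for data splitting
}
```

The getters (`getSampleFraction`, `getMaxTrainingSample`, `getReserveTestFraction`, `getSeed`) should then return the values set above.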

Linear Supertypes
DataBalancerParams, Splitter, SplitterParams, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new DataBalancer(uid: String = UID[DataBalancer])


Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T

    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  5. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  6. final def clear(param: Param[_]): DataBalancer.this.type

    Definition Classes
    Params
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. def copy(extra: ParamMap): DataBalancer

    Definition Classes
    DataBalancer → Params
  9. def copyValues[T <: Params](to: T, extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  10. final def defaultCopy[T <: Params](extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  11. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  12. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  13. def explainParam(param: Param[_]): String

    Definition Classes
    Params
  14. def explainParams(): String

    Definition Classes
    Params
  15. final def extractParamMap(): ParamMap

    Definition Classes
    Params
  16. final def extractParamMap(extra: ParamMap): ParamMap

    Definition Classes
    Params
  17. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  18. final def get[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  19. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  20. final def getDefault[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  21. def getMaxTrainingSample: Int

    Definition Classes
    DataBalancerParams
  22. final def getOrDefault[T](param: Param[T]): T

    Definition Classes
    Params
  23. def getParam(paramName: String): Param[Any]

    Definition Classes
    Params
  24. def getProportions(smallCount: Double, bigCount: Double, sampleF: Double, maxTrainingSample: Int): (Double, Double)

    Computes the upSample and downSample proportions.

    smallCount

    size of the minority class data

    bigCount

    size of the majority class data

    sampleF

    target fraction of minority class data

    maxTrainingSample

    maximum training size

    returns

    downSample and upSample proportions

  25. def getReserveTestFraction: Double

    Definition Classes
    SplitterParams
  26. def getSampleFraction: Double

    Definition Classes
    DataBalancerParams
  27. def getSeed: Long

    Definition Classes
    SplitterParams
  28. final def hasDefault[T](param: Param[T]): Boolean

    Definition Classes
    Params
  29. def hasParam(paramName: String): Boolean

    Definition Classes
    Params
  30. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  31. final def isDefined(param: Param[_]): Boolean

    Definition Classes
    Params
  32. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  33. final def isSet(param: Param[_]): Boolean

    Definition Classes
    Params
  34. final val maxTrainingSample: IntParam

    Maximum size of the dataset to train on. Value should be > 0. Default is 5000.

    Definition Classes
    DataBalancerParams
  35. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  36. final def notify(): Unit

    Definition Classes
    AnyRef
  37. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  38. lazy val params: Array[Param[_]]

    Definition Classes
    Params
  39. def prepare(data: Dataset[Row]): ModelData

    Splits the data into a training set and a test set, then balances the training set.

    data

    the data to prepare for model training; the first column must be the label as a double

    returns

    a balanced training set and a test set

    Definition Classes
    DataBalancer → Splitter
  40. final val reserveTestFraction: DoubleParam

    Fraction of data to reserve for testing. Default is 0.1.

    Definition Classes
    SplitterParams
  41. final val sampleFraction: DoubleParam

    Target sample fraction for the minority class. Value should be in the open interval (0.0, 1.0). Default is 0.1.

    Definition Classes
    DataBalancerParams
  42. final val seed: LongParam

    Seed for data splitting.

    Definition Classes
    SplitterParams
  43. final def set(paramPair: ParamPair[_]): DataBalancer.this.type

    Attributes
    protected
    Definition Classes
    Params
  44. final def set(param: String, value: Any): DataBalancer.this.type

    Attributes
    protected
    Definition Classes
    Params
  45. final def set[T](param: Param[T], value: T): DataBalancer.this.type

    Definition Classes
    Params
  46. final def setDefault(paramPairs: ParamPair[_]*): DataBalancer.this.type

    Attributes
    protected
    Definition Classes
    Params
  47. final def setDefault[T](param: Param[T], value: T): DataBalancer.this.type

    Attributes
    protected
    Definition Classes
    Params
  48. def setMaxTrainingSample(value: Int): DataBalancer.this.type

    Definition Classes
    DataBalancerParams
  49. def setReserveTestFraction(value: Double): DataBalancer.this.type

    Definition Classes
    SplitterParams
  50. def setSampleFraction(value: Double): DataBalancer.this.type

    Definition Classes
    DataBalancerParams
  51. def setSeed(value: Long): DataBalancer.this.type

    Definition Classes
    SplitterParams
  52. def split[T](data: Dataset[T]): (Dataset[T], Dataset[T])

    Creates the training set and the test set.

    returns

    (dataTrain, dataTest)

    Definition Classes
    Splitter
  53. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  54. def toString(): String

    Definition Classes
    Identifiable → AnyRef → Any
  55. val uid: String

    Definition Classes
    Splitter → Identifiable
  56. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  57. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  58. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
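
The idea behind `getProportions` (turning class counts, a target minority fraction, and a size cap into sampling proportions) can be sketched in plain Scala. This is not the library's implementation, whose exact formula this page does not give. It is one simple scheme under stated assumptions: keep as many majority rows as the cap allows, then scale the minority class to reach `sampleF`. The names `ProportionsSketch` and `proportions` are hypothetical.

```scala
// Hypothetical sketch, not the library's getProportions.
object ProportionsSketch {
  /** Returns (downSample, upSample) proportions under a simple balancing scheme. */
  def proportions(
    smallCount: Double,
    bigCount: Double,
    sampleF: Double,
    maxTrainingSample: Int
  ): (Double, Double) = {
    require(smallCount > 0 && bigCount >= smallCount, "expected 0 < smallCount <= bigCount")
    require(sampleF > 0.0 && sampleF < 1.0, "sampleF must be in (0.0, 1.0)")
    require(maxTrainingSample > 0, "maxTrainingSample must be > 0")

    // Majority rows to keep: at most the (1 - sampleF) share of the size cap.
    val keptBig = math.min(bigCount, maxTrainingSample * (1.0 - sampleF))
    // Minority rows needed so that minority / (minority + majority) equals sampleF.
    val neededSmall = keptBig * sampleF / (1.0 - sampleF)

    val downSample = keptBig / bigCount      // fraction of the majority class kept, <= 1.0
    val upSample = neededSmall / smallCount  // > 1.0 oversamples the minority, < 1.0 subsamples it
    (downSample, upSample)
  }
}
```

For example, with 100 minority rows, 10000 majority rows, `sampleF = 0.1`, and a cap of 5000, this scheme keeps about 45% of the majority class and oversamples the minority class roughly fivefold, giving about 500 minority rows out of about 5000.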

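
To show how `reserveTestFraction` and `seed` interact in a `split`-style method, here is a small sketch on an in-memory `Vector` rather than a Spark `Dataset`. It is an assumption-laden illustration, not the library's code: `SplitSketch` and its `split` signature are hypothetical, and each row is assigned to the test set independently with probability `reserveTestFraction`.

```scala
import scala.util.Random

// Hypothetical sketch of a seeded train/test split on a local collection.
object SplitSketch {
  /** Returns (dataTrain, dataTest), matching the return order documented for split. */
  def split[T](data: Vector[T], reserveTestFraction: Double, seed: Long): (Vector[T], Vector[T]) = {
    require(reserveTestFraction >= 0.0 && reserveTestFraction < 1.0,
      "reserveTestFraction must be in [0.0, 1.0)")
    val rng = new Random(seed)
    // Vector.partition is strict and visits elements in order, so a fixed
    // seed always yields the same assignment.
    val (test, train) = data.partition(_ => rng.nextDouble() < reserveTestFraction)
    (train, test)
  }
}
```

Calling `SplitSketch.split` twice with the same data and seed returns identical sets, which is the reproducibility the `seed` parameter is meant to provide. The test set size is only approximately `reserveTestFraction` of the input because rows are assigned at random.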