com.salesforce.op.stages.impl.tuning

DataCutter

Related Docs: object DataCutter | package tuning

class DataCutter extends Splitter with DataCutterParams

Makes a holdout set and prepares the data for multiclass modeling. Creates an instance that splits data into training and test sets, filtering out any labels that do not meet the minimum fraction cutoff or do not fall within the top N labels specified.
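
The label filtering described above can be sketched in plain Scala, without Spark. This is an illustration under assumptions, not the library's implementation: the names `LabelFilterSketch` and `labelsToKeep` are hypothetical, the two conditions are assumed to combine as "meets the minimum fraction and is among the N most frequent", and the tie-breaking order is a choice made for the example.

```scala
object LabelFilterSketch {

  /** Returns the labels to keep: those whose share of all rows is at least
    * `minLabelFraction`, limited to the `maxLabelCategories` most frequent.
    * Ties in frequency are broken by the smaller label value (an assumption).
    */
  def labelsToKeep(
    labels: Seq[Double],
    minLabelFraction: Double,
    maxLabelCategories: Int
  ): Set[Double] = {
    val total = labels.size.toDouble
    labels
      .groupBy(identity)
      .map { case (label, rows) => (label, rows.size) }
      .toSeq
      // Drop labels that are too rare.
      .filter { case (_, count) => count / total >= minLabelFraction }
      // Most frequent first; smaller label wins a tie.
      .sortWith { case ((l1, c1), (l2, c2)) => c1 > c2 || (c1 == c2 && l1 < l2) }
      .take(maxLabelCategories)
      .map(_._1)
      .toSet
  }

  def main(args: Array[String]): Unit = {
    // Label shares: 0.0 -> 0.6, 1.0 -> 0.3, 2.0 -> 0.1
    val labels = Seq.fill(6)(0.0) ++ Seq.fill(3)(1.0) ++ Seq(2.0)
    println(labelsToKeep(labels, minLabelFraction = 0.2, maxLabelCategories = 2))
  }
}
```

Rows whose label is not in the returned set would then be dropped before modeling.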

Linear Supertypes
DataCutterParams, Splitter, SplitterParams, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new DataCutter(uid: String = UID[DataCutter])

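A minimal configuration sketch that uses only the constructor and the public setters listed on this page. It assumes the library is on the classpath; the object name `DataCutterConfigSketch` is hypothetical and the parameter values are arbitrary illustrations, not the library defaults.

```scala
import com.salesforce.op.stages.impl.tuning.DataCutter

object DataCutterConfigSketch {
  // Each setter returns DataCutter.this.type, so the calls can be chained.
  // The values below are illustrative only.
  val cutter: DataCutter = new DataCutter()
    .setSeed(42L)                // seed for data splitting
    .setReserveTestFraction(0.2) // fraction of data reserved for test
    .setMaxLabelCategories(10)   // keep at most the top 10 labels
    .setMinLabelFraction(0.05)   // drop labels below a 5% share
}
```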

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T

    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  5. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  6. final def clear(param: Param[_]): DataCutter.this.type

    Definition Classes
    Params
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. def copy(extra: ParamMap): DataCutter

    Definition Classes
    DataCutter → Params
  9. def copyValues[T <: Params](to: T, extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  10. final def defaultCopy[T <: Params](extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  11. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  12. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  13. def explainParam(param: Param[_]): String

    Definition Classes
    Params
  14. def explainParams(): String

    Definition Classes
    Params
  15. final def extractParamMap(): ParamMap

    Definition Classes
    Params
  16. final def extractParamMap(extra: ParamMap): ParamMap

    Definition Classes
    Params
  17. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  18. final def get[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  19. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  20. final def getDefault[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  21. def getMaxLabelCategories: Int

    Definition Classes
    DataCutterParams
  22. def getMinLabelFraction: Double

    Definition Classes
    DataCutterParams
  23. final def getOrDefault[T](param: Param[T]): T

    Definition Classes
    Params
  24. def getParam(paramName: String): Param[Any]

    Definition Classes
    Params
  25. def getReserveTestFraction: Double

    Definition Classes
    SplitterParams
  26. def getSeed: Long

    Definition Classes
    SplitterParams
  27. final def hasDefault[T](param: Param[T]): Boolean

    Definition Classes
    Params
  28. def hasParam(paramName: String): Boolean

    Definition Classes
    Params
  29. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  30. final def isDefined(param: Param[_]): Boolean

    Definition Classes
    Params
  31. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  32. final def isSet(param: Param[_]): Boolean

    Definition Classes
    Params
  33. final val maxLabelCategories: IntParam

    Definition Classes
    DataCutterParams
  34. final val minLabelFraction: DoubleParam

    Definition Classes
    DataCutterParams
  35. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  36. final def notify(): Unit

    Definition Classes
    AnyRef
  37. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  38. lazy val params: Array[Param[_]]

    Definition Classes
    Params
  39. def prepare(data: Dataset[Row]): ModelData

    Function used to prepare the dataset for modeling, e.g. by balancing the data or dropping rows based on their labels.

    data

    first column must be the label as a double

    returns

    Training set and test set

    Definition Classes
    DataCutter → Splitter
  40. final val reserveTestFraction: DoubleParam

    Fraction of data to reserve for the test set. Default is 0.1.

    Definition Classes
    SplitterParams
  41. final val seed: LongParam

    Seed for data splitting.

    Definition Classes
    SplitterParams
  42. final def set(paramPair: ParamPair[_]): DataCutter.this.type

    Attributes
    protected
    Definition Classes
    Params
  43. final def set(param: String, value: Any): DataCutter.this.type

    Attributes
    protected
    Definition Classes
    Params
  44. final def set[T](param: Param[T], value: T): DataCutter.this.type

    Definition Classes
    Params
  45. final def setDefault(paramPairs: ParamPair[_]*): DataCutter.this.type

    Attributes
    protected
    Definition Classes
    Params
  46. final def setDefault[T](param: Param[T], value: T): DataCutter.this.type

    Attributes
    protected
    Definition Classes
    Params
  47. def setMaxLabelCategories(value: Int): DataCutter.this.type

    Definition Classes
    DataCutterParams
  48. def setMinLabelFraction(value: Double): DataCutter.this.type

    Definition Classes
    DataCutterParams
  49. def setReserveTestFraction(value: Double): DataCutter.this.type

    Definition Classes
    SplitterParams
  50. def setSeed(value: Long): DataCutter.this.type

    Definition Classes
    SplitterParams
  51. def split[T](data: Dataset[T]): (Dataset[T], Dataset[T])

    Function used to create the training set and test set.

    returns

    (dataTrain, dataTest)

    Definition Classes
    Splitter
  52. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  53. def toString(): String

    Definition Classes
    Identifiable → AnyRef → Any
  54. val uid: String

    Definition Classes
    Splitter → Identifiable
  55. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  56. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  57. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
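
The roles of `reserveTestFraction` and `seed` in the `split` member listed above can be illustrated with a plain-Scala holdout split over an in-memory sequence. This is a sketch under assumptions: `HoldoutSplitSketch` is a hypothetical name, and this page does not state how the library performs the split on a Spark `Dataset`, so the per-row random assignment below is only one plausible approach.

```scala
import scala.util.Random

object HoldoutSplitSketch {

  /** Splits `data` into (train, test). Each row is assigned to the test set
    * with probability `reserveTestFraction`; the same `seed` always yields
    * the same split.
    */
  def split[T](data: Seq[T], reserveTestFraction: Double, seed: Long): (Seq[T], Seq[T]) = {
    require(
      reserveTestFraction >= 0.0 && reserveTestFraction <= 1.0,
      "reserveTestFraction must be between 0 and 1"
    )
    val rng = new Random(seed)
    val (test, train) = data.partition(_ => rng.nextDouble() < reserveTestFraction)
    (train, test)
  }

  def main(args: Array[String]): Unit = {
    val (train, test) = split((1 to 100).toVector, reserveTestFraction = 0.1, seed = 42L)
    println(s"train=${train.size}, test=${test.size}")
  }
}
```

Because assignment is random per row, the test set size is only approximately `reserveTestFraction` of the data.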
