com.databricks.labs.automl.model.tools.split

DataSplitUtility

class DataSplitUtility extends SplitUtilityTooling

Train / Test split handler class

Since

0.7.1

Linear Supertypes
SplitUtilityTooling, SparkSessionWrapper, Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new DataSplitUtility(mainDataset: DataFrame, kIterations: Int, splitMethod: String, labelColumn: String, rootDir: String, persistMode: String, modelFamily: String, parallelism: Int, trainPortion: Double, syntheticCol: String, trainSplitChronologicalColumn: String, trainSplitChronologicalRandomPercentage: Double, reductionFactor: Double)

    mainDataset

    Dataset that contains the feature vector, as output by the DataPrep phase, ready to be split into train and test sets

    kIterations

    Number of 'copies' of the split to perform, one for each of the kFold models to be built

    splitMethod

    The type of split being performed (e.g. 'stratified', 'random', 'kSample')

    labelColumn

    Name of the label column

    rootDir

    Source directory in which the Delta-persisted data sets are built when persistMode is 'delta'

    persistMode

    'cache', 'persist' or 'delta' - how to retain each of the kFold train/test splits.

    modelFamily

    The model family, used to determine how many partitions the train and test splits are repartitioned into for optimal performance.
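
The parameter descriptions above imply that one train/test split is produced per kFold model. The following is a minimal sketch of that idea in plain Scala, not the library's implementation: `Record`, `Split`, `StratifiedSplitSketch`, and the seeding scheme are hypothetical names introduced for illustration. The data is an in-memory `Seq` rather than a Spark DataFrame, and 'stratified' is assumed to mean that each label is split separately at `trainPortion`.

```scala
import scala.util.Random

// Hypothetical in-memory stand-ins; the real class operates on a Spark DataFrame.
final case class Record(label: String, feature: Double)
final case class Split(train: Seq[Record], test: Seq[Record])

object StratifiedSplitSketch {

  // Split each label group separately so both sides keep the label proportions.
  def stratifiedSplit(data: Seq[Record], trainPortion: Double, seed: Long): Split = {
    require(trainPortion > 0.0 && trainPortion < 1.0, "trainPortion must be in (0, 1)")
    val rng = new Random(seed)
    // Sort the groups by label so the result is deterministic for a given seed.
    val perLabel = data.groupBy(_.label).toSeq.sortBy(_._1).map { case (_, rows) =>
      val shuffled = rng.shuffle(rows)
      val cut = math.round(rows.size * trainPortion).toInt
      shuffled.splitAt(cut)
    }
    Split(perLabel.flatMap(_._1), perLabel.flatMap(_._2))
  }

  // One independent split per kFold model, mirroring the role of kIterations.
  def kSplits(data: Seq[Record], kIterations: Int, trainPortion: Double, seed: Long = 42L): Seq[Split] =
    (0 until kIterations).map(i => stratifiedSplit(data, trainPortion, seed + i))
}
```

Calling `StratifiedSplitSketch.kSplits(data, kIterations = 3, trainPortion = 0.75)` would return three splits, each drawn with a different seed but with the same per-label train/test proportions.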

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. def formRootPath(configStoreLocation: String): String

    Definition Classes
    SplitUtilityTooling
  10. def formTrainTestPaths(configStoreLocation: String): TrainTestPaths

    Definition Classes
    SplitUtilityTooling
  11. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  12. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  13. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  14. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  15. final def notify(): Unit

    Definition Classes
    AnyRef
  16. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  17. def performSplit: Array[TrainSplitReferences]

    Wrapper interface for performing the splits, dependent on mode

    returns

    Array[TrainSplitReferences] produced by the split routine for the configured mode.

  18. lazy val sc: SparkContext

    Definition Classes
    SparkSessionWrapper
  19. lazy val spark: SparkSession

    Definition Classes
    SparkSessionWrapper
  20. def storeLoadDelta(trainData: DataFrame, testData: DataFrame, paths: TrainTestPaths): TrainTestData

    Definition Classes
    SplitUtilityTooling
  21. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  22. def toString(): String

    Definition Classes
    AnyRef → Any
  23. final val uniqueLabels: Array[Row]

  24. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  25. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  26. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
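
performSplit is described as a wrapper whose behavior depends on mode, and persistMode is documented as accepting 'cache', 'persist' or 'delta', with rootDir used only for 'delta'. A minimal sketch of how such a mode string might be validated before dispatching is shown here. The `PersistMode` type and its `parse` method are hypothetical names, not part of the library, and the exact-match, case-sensitive comparison is an assumption.

```scala
// Hypothetical ADT for the three documented persistMode values; not part of the library.
sealed trait PersistMode

object PersistMode {
  case object Cache extends PersistMode
  case object Persist extends PersistMode
  final case class Delta(rootDir: String) extends PersistMode

  // Map the documented string values onto the ADT, rejecting anything else.
  // Assumption: values are matched exactly (case-sensitive).
  def parse(persistMode: String, rootDir: String): PersistMode = persistMode match {
    case "cache"   => Cache
    case "persist" => Persist
    case "delta"   =>
      // rootDir is only documented as being used in 'delta' mode.
      require(rootDir.nonEmpty, "rootDir must be set when persistMode is 'delta'")
      Delta(rootDir)
    case other =>
      throw new IllegalArgumentException(
        s"Unsupported persistMode '$other'; expected 'cache', 'persist' or 'delta'")
  }
}
```

Parsing the string once into a sealed type lets the compiler check that every mode is handled wherever the value is later matched.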
