trait Operators extends AnyRef
High level missing value imputation operators. NaN values in the input data are treated as missing values and are replaced with imputed values after processing.
Value Members
- def avgimpute(data: Array[Array[Double]]): Unit
  Impute missing values with the average of the other attributes in the same instance. This assumes the attributes of the data set are of the same kind, e.g. microarray gene expression data, so that a missing value can be estimated as the average of the non-missing attributes in the same instance. Note that this is not the average of the same attribute across different instances.
  - data
    the data set with missing values.
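The idea behind avgimpute can be sketched in a few lines of plain Scala. This is an illustrative implementation, not Smile's; the function name is hypothetical.

```scala
// Row-average imputation sketch: each NaN is replaced by the mean of the
// non-missing attributes in the same instance (row), in place.
def avgImputeSketch(data: Array[Array[Double]]): Unit = {
  for (row <- data) {
    val present = row.filter(!_.isNaN)
    if (present.nonEmpty) {
      val mean = present.sum / present.length
      for (j <- row.indices if row(j).isNaN) row(j) = mean
    }
  }
}
```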
- def impute(data: Array[Array[Double]], k: Int, runs: Int = 1): Unit
  Missing value imputation by K-Means clustering. First cluster the data by K-Means in the presence of missing values, then impute each missing value with the average of that attribute over the instances in its cluster.
  - data
    the data set.
  - k
    the number of clusters.
  - runs
    the number of runs of the K-Means algorithm.
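The imputation step of this method can be sketched as follows, assuming cluster labels have already been produced by a K-Means run that tolerates NaNs in its distance computation (the clustering itself is omitted, and the function name is illustrative, not Smile's).

```scala
// Fill each missing value with the mean of that attribute over the
// non-missing values in the same cluster, in place.
def clusterImputeSketch(data: Array[Array[Double]], labels: Array[Int]): Unit = {
  val p = data(0).length
  for (c <- labels.distinct; j <- 0 until p) {
    val members = data.indices.filter(labels(_) == c)
    val present = members.map(i => data(i)(j)).filter(!_.isNaN)
    if (present.nonEmpty) {
      val mean = present.sum / present.length
      for (i <- members if data(i)(j).isNaN) data(i)(j) = mean
    }
  }
}
```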
- def knnimpute(data: Array[Array[Double]], k: Int): Unit
  Missing value imputation by k-nearest neighbors. The KNN-based method imputes a missing value using instances similar to the instance of interest. For an instance A with a missing value on attribute i, it finds the K instances that have a value on attribute i and are most similar to A (in terms of some distance, e.g. Euclidean distance) on the attributes where A has no missing values. The average of the K nearest neighbors' values on attribute i is then used as the estimate for the missing value in A.
  - data
    the data set with missing values.
  - k
    the number of neighbors.
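The selection of neighbors for a single missing entry can be sketched as below; this is an illustrative helper, not Smile's implementation.

```scala
// Estimate the missing entry data(i)(j) by KNN: rank candidate rows that
// have attribute j present by squared Euclidean distance over the attributes
// observed in the target row, then average the k nearest values on j.
def knnImputeOne(data: Array[Array[Double]], i: Int, j: Int, k: Int): Double = {
  val target = data(i)
  val obs = target.indices.filter(c => c != j && !target(c).isNaN)
  val candidates = data.indices.filter { r =>
    r != i && !data(r)(j).isNaN && obs.forall(c => !data(r)(c).isNaN)
  }
  val nearest = candidates.sortBy { r =>
    obs.map(c => math.pow(data(r)(c) - target(c), 2)).sum
  }.take(k)
  nearest.map(r => data(r)(j)).sum / nearest.length
}
```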
- def llsimpute(data: Array[Array[Double]], k: Int): Unit
  Local least squares missing value imputation. This method represents a target instance that has missing values as a linear combination of similar instances, which are selected by the k-nearest neighbors method.
  - data
    the data set.
  - k
    the number of similar rows used for imputation.
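For k = 1 the local least squares problem has a closed form: the target row b is approximated as w·a for its nearest complete neighbor a, where w minimizes the squared error over the observed columns. A sketch of this special case (illustrative, not Smile's implementation):

```scala
// k = 1 local least squares: fit b ≈ w * a over the columns observed in b
// (excluding the missing column j), then estimate the missing entry as w * a(j).
// The least-squares weight is w = (a · b) / (a · a) over the observed columns.
def lls1Sketch(a: Array[Double], b: Array[Double], j: Int): Double = {
  val obs = b.indices.filter(c => c != j && !b(c).isNaN)
  val w = obs.map(c => a(c) * b(c)).sum / obs.map(c => a(c) * a(c)).sum
  w * a(j)
}
```

For general k the same idea solves a k-variable least squares system instead of a scalar one.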
- def svdimpute(data: Array[Array[Double]], k: Int, maxIter: Int = 10): Unit
  Missing value imputation with singular value decomposition. Given the SVD A = U Σ Vᵀ, the most significant eigenvectors of Vᵀ are used to linearly estimate missing values. Although several significant eigenvectors have been shown to describe the data with small error, the exact fraction of eigenvectors best suited for estimation must be determined empirically. Once the k most significant eigenvectors of Vᵀ are selected, a missing value j in row i is estimated by first regressing the row against the k eigenvectors and then using the regression coefficients to reconstruct j as a linear combination of the k eigenvectors. The j-th value of row i and the j-th values of the k eigenvectors are not used in determining the regression coefficients. Since SVD can only be performed on a complete matrix, the missing values in A are first filled by another method, yielding A'. An expectation-maximization procedure then produces the final estimate: each missing value in A is estimated by the above algorithm, and the procedure is repeated on the newly obtained matrix until the total change in the matrix falls below an empirically determined threshold (say 0.01).
  - data
    the data set.
  - k
    the number of eigenvectors used for imputation.
  - maxIter
    the maximum number of iterations.
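The expectation-maximization outer loop described above can be sketched generically; the SVD regression step is abstracted as a supplied `estimate` function, and the name and signature are illustrative, not Smile's. Missing entries are assumed to have been pre-filled by another method before the loop starts.

```scala
// EM-style refinement: re-estimate each (pre-filled) missing entry until the
// total change over one pass falls below the threshold, or maxIter is hit.
def emImputeSketch(data: Array[Array[Double]],
                   missing: Seq[(Int, Int)],
                   estimate: (Array[Array[Double]], Int, Int) => Double,
                   threshold: Double = 0.01,
                   maxIter: Int = 10): Unit = {
  var iter = 0
  var change = Double.MaxValue
  while (change > threshold && iter < maxIter) {
    change = 0.0
    for ((i, j) <- missing) {
      val v = estimate(data, i, j)      // stand-in for the SVD regression step
      change += math.abs(v - data(i)(j))
      data(i)(j) = v
    }
    iter += 1
  }
}
```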
High level Smile operators in Scala.