Compute the "Area Under the Curve" for a collection of predictions.
Compute the "Area Under the Curve" for a collection of predictions. Uses the Trapezoid method to compute the area.
Internally, a linspace is defined using the given number of samples. Each point in the linspace represents a threshold, which is used to build a confusion matrix. The area is then computed over this list of confusion matrices.
The AUCMetric given to the aggregator selects the function applied to each confusion matrix prior to the AUC calculation.
Which function to apply to the confusion matrix.
Number of samples to use for the curve definition.
Which function to apply to the list of confusion matrices prior to the AUC calculation.
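As a rough illustration of the threshold sweep described above, the computation could be sketched in Python as follows (a minimal sketch of the logic, not this library's API; the function names are invented for the example):

```python
def confusion_counts(labels, scores, threshold):
    """Counts for a binary confusion matrix at one threshold."""
    tp = fp = fn = tn = 0
    for y, s in zip(labels, scores):
        p = 1 if s >= threshold else 0
        if y == 1 and p == 1:
            tp += 1
        elif y == 0 and p == 1:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def roc_auc(labels, scores, samples=100):
    """Linspace of thresholds -> confusion matrices -> ROC points -> trapezoid area."""
    points = []
    for i in range(samples):
        t = i / (samples - 1)                      # linspace over [0, 1]
        tp, fp, fn, tn = confusion_counts(labels, scores, t)
        tpr = tp / (tp + fn) if tp + fn else 0.0   # true-positive rate (y)
        fpr = fp / (fp + tn) if fp + tn else 0.0   # false-positive rate (x)
        points.append((fpr, tpr))
    points.sort()                                  # order by x for the trapezoid rule
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0        # trapezoid method
    return area
```

A perfectly separable set of scores yields an area of 1.0; this sketch uses ROC as the metric, but the same sweep supports other curves.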
Special case of a binary confusion matrix, provided to make it easier to compose with other binary aggregators.
Threshold to apply to predictions.
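The thresholding step can be sketched like this (an illustrative snippet, not the library's API; the function name is invented):

```python
from collections import Counter

def binary_confusion_matrix(predictions, threshold=0.5):
    """Tally (actual, predicted) pairs after thresholding binary scores."""
    counts = Counter()
    for actual, score in predictions:
        predicted = 1 if score >= threshold else 0  # apply the threshold
        counts[(actual, predicted)] += 1
    return counts
```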
Split predictions into Tensorflow Model Analysis compatible CalibrationHistogramBucket buckets.
If a prediction is less than the lower bound, it belongs to the bucket [-inf, lower bound). If it is greater than or equal to the upper bound, it belongs to the bucket [upper bound, inf].
Left boundary, inclusive
Right boundary, exclusive
Number of buckets in the histogram
Histogram bucket.
Lower bound on bucket, inclusive
Upper bound on bucket, exclusive
Number of predictions in this bucket
Sum of label values for this bucket
Sum of prediction values for this bucket
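The bucket-assignment rule above (regular buckets inside [lower, upper), plus underflow and overflow buckets) could be sketched as follows (an illustrative snippet with an invented function name, not the library's API):

```python
import math

def bucket_of(pred, lower=0.0, upper=1.0, num_buckets=10):
    """Return the (low, high) bounds of the bucket containing `pred`."""
    if pred < lower:
        return (-math.inf, lower)        # underflow bucket
    if pred >= upper:
        return (upper, math.inf)         # overflow bucket
    width = (upper - lower) / num_buckets
    i = int((pred - lower) / width)      # 0-based index of the regular bucket
    return (lower + i * width, lower + (i + 1) * width)
```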
Generate a Classification Report for a collection of binary predictions. The output of this aggregator will be a Report object.
Threshold to apply to get the predictions.
Beta parameter used in the f-score calculation.
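The metrics a binary report typically contains can be sketched as below (a minimal illustration of the standard precision/recall/F-beta formulas, not this library's implementation; the function name is invented):

```python
def classification_report(predictions, threshold=0.5, beta=1.0):
    """Precision, recall and F-beta for binary (actual, score) pairs."""
    tp = fp = fn = 0
    for actual, score in predictions:
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1:
            fp += 1
        elif actual == 1:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    b2 = beta * beta
    fscore = ((1 + b2) * precision * recall / (b2 * precision + recall)
              if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "fscore": fscore}
```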
Generic Confusion Matrix Aggregator for any dimension. Thresholds must be applied to make a prediction prior to using this aggregator.
List of possible label values
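Because predictions arrive already thresholded, the aggregation reduces to counting (actual, predicted) pairs over the label set, as in this sketch (invented function name, not the library's API):

```python
def confusion_matrix(pairs, labels):
    """Count (actual, predicted) occurrences over a fixed label set."""
    matrix = {(a, p): 0 for a in labels for p in labels}  # all cells start at 0
    for actual, predicted in pairs:
        matrix[(actual, predicted)] += 1
    return matrix
```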
Compute a series of points for a collection of predictions.
Internally, a linspace is defined using the given number of samples. Each point in the linspace represents a threshold, which is used to build a confusion matrix. The (x, y) points of the curve are then returned.
The AUCMetric given to the aggregator selects the function applied to each confusion matrix prior to computing the curve points.
Which function to apply to the confusion matrix.
Number of samples to use for the curve definition.
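For example, with a precision-recall metric the sweep would emit one (recall, precision) point per threshold, as in this sketch (an illustration of the idea, not the library's API; the function name is invented):

```python
def pr_curve(labels, scores, samples=5):
    """Precision-recall points, one per threshold in a linspace over [0, 1]."""
    pts = []
    for i in range(samples):
        t = i / (samples - 1)
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < t)
        # Convention: precision is 1.0 when nothing is predicted positive.
        precision = tp / (tp + fp) if tp + fp else 1.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        pts.append((recall, precision))
    return pts
```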
Returns the mean average precision (MAP) of all the predictions. If a query has an empty ground truth set, the average precision will be zero.
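A sketch of the standard MAP computation, including the empty-ground-truth rule above (illustrative only; function names are invented):

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked result list; 0.0 for empty ground truth."""
    if not relevant:
        return 0.0
    hits, score = 0, 0.0
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            score += hits / i          # precision at this hit's position
    return score / len(relevant)

def mean_average_precision(queries):
    """Mean of per-query average precisions over (ranked, relevant) pairs."""
    return sum(average_precision(r, g) for r, g in queries) / len(queries)
```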
Aggregator which combines an unbounded list of other aggregators. Each aggregator in the list is tagged by a string; this name can be used to retrieve the aggregated value from the map emitted by the "present" function.
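The name-to-result mapping can be pictured as follows (a deliberately simplified sketch where each "aggregator" is just a function over the predictions; not how the library composes aggregators internally):

```python
def multi_aggregate(predictions, aggregators):
    """Apply each named aggregator to the same predictions; key results by name."""
    return {name: agg(predictions) for name, agg in aggregators.items()}
```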
Generate a Classification Report for a collection of multiclass predictions. A report is generated for each class by treating the predictions as binary: either "class" or "not class". The output of this aggregator is a map from each class to its Report object.
List of possible label values.
Beta parameter used in the f-score calculation.
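The one-vs-rest reduction described above can be sketched like this (illustrative only; the function name and report shape are invented):

```python
def multiclass_report(pairs, labels, beta=1.0):
    """One report per class, treating predictions as class vs. not-class."""
    reports = {}
    for cls in labels:
        tp = sum(1 for a, p in pairs if a == cls and p == cls)
        fp = sum(1 for a, p in pairs if a != cls and p == cls)
        fn = sum(1 for a, p in pairs if a == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        b2 = beta * beta
        fscore = ((1 + b2) * precision * recall / (b2 * precision + recall)
                  if precision + recall else 0.0)
        reports[cls] = {"precision": precision, "recall": recall, "fscore": fscore}
    return reports
```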
Compute the average NDCG value of all the predictions, truncated at ranking position k. The discounted cumulative gain at position k is computed as DCG@k = sum_{i=1}^{k} (2^{relevance of ith item} - 1) / log(i + 1), and the NDCG is obtained by dividing it by the DCG of the ground truth set (the ideal DCG). In the current implementation, the relevance value is binary. If a query has an empty ground truth set, zero will be used as the NDCG.
See the following paper for details:
IR evaluation methods for retrieving highly relevant documents. K. Jarvelin and J. Kekalainen
The position at which to compute the truncated NDCG; must be positive.
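With binary relevance, the formula above can be sketched as follows (illustrative only; the function name is invented, and log base 2 is used since the base cancels in the DCG/ideal-DCG ratio anyway):

```python
import math

def ndcg_at_k(ranked, relevant, k):
    """NDCG@k with binary relevance; 0.0 for an empty ground truth set."""
    if not relevant:
        return 0.0
    # DCG@k = sum_{i=1}^{k} (2^{rel_i} - 1) / log(i + 1)
    dcg = sum((2 ** (1 if item in relevant else 0) - 1) / math.log2(i + 1)
              for i, item in enumerate(ranked[:k], start=1))
    # Ideal DCG: the first min(k, |relevant|) positions are all relevant.
    ideal = sum(1.0 / math.log2(i + 1)
                for i in range(1, min(k, len(relevant)) + 1))
    return dcg / ideal
```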
Compute the average precision of all the predictions, truncated at ranking position k.
If, for a prediction, the ranking algorithm returns n results where n < k, the precision value is still computed as #(relevant items retrieved) / k. The same formula applies when the size of the ground truth set is less than k.
If a prediction has an empty ground truth set, zero will be used as the precision.
See the following paper for details:
IR evaluation methods for retrieving highly relevant documents. K. Jarvelin and J. Kekalainen
The position at which to compute the truncated precision; must be positive.
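The divide-by-k rule, including the short-result-list and empty-ground-truth cases, can be sketched as (illustrative only; the function name is invented):

```python
def precision_at_k(ranked, relevant, k):
    """Truncated precision: #(relevant in top k) / k; 0.0 for empty ground truth."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k   # always divide by k, even when fewer than k results exist
```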
Generic Prediction Object used by most aggregators
Type of the Real Value
Type of the Predicted Value
Actual value for this entry, normally referred to as the label.
Predicted value. Can be a class or a score depending on the aggregator.
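The shape of this object can be pictured as a small generic record (a Python sketch, not the library's actual type):

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

L = TypeVar("L")   # type of the actual (label) value
S = TypeVar("S")   # type of the predicted value (class or score)

@dataclass(frozen=True)
class Prediction(Generic[L, S]):
    actual: L      # real value, normally the label
    predicted: S   # predicted class or score, depending on the aggregator
```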
Classification Report
The percentage of values that were predicted incorrectly.