Package

com.enriquegrodrigo.spark.crowd

methods

package methods

Value Members

  1. object DawidSkene

    Provides functions for transforming an annotation dataset into a standard label dataset using the Dawid-Skene algorithm.

    This algorithm only works with com.enriquegrodrigo.spark.crowd.types.MulticlassAnnotation datasets.

    Example:

      val result: DawidSkeneModel = DawidSkene(dataset)

    Version

    0.1

    See also

    Dawid, Alexander Philip, and Allan M. Skene. "Maximum likelihood estimation of observer error-rates using the EM algorithm." Applied statistics (1979): 20-28.

  2. object Glad

    Provides functions for transforming an annotation dataset into a standard label dataset using the GLAD algorithm.

    This algorithm only works with com.enriquegrodrigo.spark.crowd.types.BinaryAnnotation datasets.

    Example:

      val result: GladModel = Glad(dataset)

    See also

    Whitehill, Jacob, et al. "Whose vote should count more: Optimal integration of labels from labelers of unknown expertise." Advances in neural information processing systems. 2009.

  3. object MajorityVoting

    Provides functions for transforming an annotation dataset into a standard label dataset using the majority voting approach.

    This object provides several functions for applying majority-voting-style algorithms to annotation datasets (Spark Datasets of type com.enriquegrodrigo.spark.crowd.types.BinaryAnnotation, com.enriquegrodrigo.spark.crowd.types.MulticlassAnnotation, or com.enriquegrodrigo.spark.crowd.types.RealAnnotation). For the discrete types (BinaryAnnotation and MulticlassAnnotation), the method uses the most frequent class. For the continuous type (RealAnnotation), it uses the mean.

    The object also provides methods for estimating the probability of a class for the discrete types. For the binary case, it computes the mean of the positive class. For the multiclass case, it computes the one-vs-all mean of a class against the others.

    Example:

      val result: Dataset[BinaryLabel] = MajorityVoting.transformBinary(dataset)

    Version

    0.1
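
The aggregation rules described for MajorityVoting can be illustrated on plain Scala collections. The following is a minimal sketch, not the library's implementation: the case classes `DiscreteAnnotation` and `RealValuedAnnotation`, their field names, the object `MajorityVotingSketch`, and the tie-breaking rule (smallest class wins) are all assumptions made for this example. It does not use Spark.

```scala
// Hypothetical stand-ins for the library's annotation types. The field names
// (example, annotator, value) are assumptions made for this sketch.
final case class DiscreteAnnotation(example: Long, annotator: Long, value: Int)
final case class RealValuedAnnotation(example: Long, annotator: Long, value: Double)

object MajorityVotingSketch {

  // Discrete case: most frequent class per example.
  // Ties are broken in favour of the smallest class (an assumption of this sketch).
  def mostFrequentClass(annotations: Seq[DiscreteAnnotation]): Map[Long, Int] =
    annotations.groupBy(_.example).map { case (example, anns) =>
      val counts = anns.groupBy(_.value).map { case (cls, xs) => (cls, xs.size) }
      val best = counts.toSeq.sortBy { case (cls, n) => (-n, cls) }.head._1
      example -> best
    }

  // Continuous case: mean of the annotated values per example.
  def meanValue(annotations: Seq[RealValuedAnnotation]): Map[Long, Double] =
    annotations.groupBy(_.example).map { case (example, anns) =>
      example -> anns.map(_.value).sum / anns.size
    }

  // One-vs-all estimate: fraction of an example's annotations equal to `cls`.
  // With 0/1 labels and cls = 1 this is the mean of the positive class.
  def classProbability(annotations: Seq[DiscreteAnnotation], cls: Int): Map[Long, Double] =
    annotations.groupBy(_.example).map { case (example, anns) =>
      example -> anns.count(_.value == cls).toDouble / anns.size
    }
}
```

For instance, if three annotators label example 0 as 1, 1, and 0, `mostFrequentClass` returns class 1 for that example and `classProbability(annotations, 1)` returns 2/3.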

  4. object RaykarBinary

    Provides functions for transforming an annotation dataset into a standard label dataset using the RaykarBinary algorithm.

    This algorithm only works with com.enriquegrodrigo.spark.crowd.types.BinaryAnnotation datasets.

    Example:

      val result: RaykarBinaryModel = RaykarBinary(dataset)

    Version

    0.1

    See also

    Raykar, Vikas C., et al. "Learning from crowds." Journal of Machine Learning Research 11.Apr (2010): 1297-1322.

  5. object RaykarCont

  6. object RaykarMulti

    Provides functions for transforming an annotation dataset into a standard label dataset using the Raykar algorithm for multiclass annotations.

    This algorithm only works with com.enriquegrodrigo.spark.crowd.types.MulticlassAnnotation datasets.

    Example:

      val result: RaykarMultiModel = RaykarMulti(dataset, annotations)

    Version

    0.1

    See also

    Raykar, Vikas C., et al. "Learning from crowds." Journal of Machine Learning Research 11.Apr (2010): 1297-1322.
