org.apache.commons.math.stat.inference
Interface UnknownDistributionChiSquareTest

All Superinterfaces:
ChiSquareTest
All Known Implementing Classes:
ChiSquareTestImpl

public interface UnknownDistributionChiSquareTest
extends ChiSquareTest

An interface for Chi-Square tests for unknown distributions.

Two-sample tests are used when the reference distribution is not known a priori but is instead represented by one sample. The second sample is then compared against the first.

Since:
1.2
Version:
$Revision: 811685 $ $Date: 2009-09-05 19:36:48 +0200 (sam. 05 sept. 2009) $

Method Summary
 double chiSquareDataSetsComparison(long[] observed1, long[] observed2)
          Computes a Chi-Square two sample test statistic comparing bin frequency counts in observed1 and observed2.
 double chiSquareTestDataSetsComparison(long[] observed1, long[] observed2)
          Returns the observed significance level, or p-value, associated with a Chi-Square two sample test comparing bin frequency counts in observed1 and observed2.
 boolean chiSquareTestDataSetsComparison(long[] observed1, long[] observed2, double alpha)
          Performs a Chi-Square two sample test comparing two binned data sets.
 
Methods inherited from interface org.apache.commons.math.stat.inference.ChiSquareTest
chiSquare, chiSquare, chiSquareTest, chiSquareTest, chiSquareTest, chiSquareTest
 

Method Detail

chiSquareDataSetsComparison

double chiSquareDataSetsComparison(long[] observed1,
                                   long[] observed2)
                                   throws IllegalArgumentException

Computes a Chi-Square two sample test statistic comparing bin frequency counts in observed1 and observed2. The sums of frequency counts in the two samples are not required to be the same. The formula used to compute the test statistic is

∑[(K * observed1[i] - observed2[i]/K)^2 / (observed1[i] + observed2[i])] where
K = √[∑(observed2) / ∑(observed1)]

This statistic can be used to perform a Chi-Square test evaluating the null hypothesis that both observed counts follow the same distribution.
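As an illustration of the formula above, the following is a minimal sketch that evaluates the statistic directly from two count arrays. It is not the code of ChiSquareTestImpl; the class name TwoSampleChiSquareSketch and the method name statistic are hypothetical, and the argument checks are illustrative guards that keep the formula well defined rather than a statement of the library's own checks.

```java
/** Hypothetical helper class; not part of Commons Math. */
final class TwoSampleChiSquareSketch {

    /** Evaluates sum[(K*o1[i] - o2[i]/K)^2 / (o1[i] + o2[i])] with K = sqrt(sum(o2)/sum(o1)). */
    static double statistic(long[] observed1, long[] observed2) {
        // Illustrative guards (assumptions), needed so that K and every term are defined.
        if (observed1.length != observed2.length || observed1.length < 2) {
            throw new IllegalArgumentException("arrays must share a length of at least 2");
        }
        long total1 = 0;
        long total2 = 0;
        for (int i = 0; i < observed1.length; i++) {
            if (observed1[i] < 0 || observed2[i] < 0) {
                throw new IllegalArgumentException("counts must be non-negative");
            }
            total1 += observed1[i];
            total2 += observed2[i];
        }
        if (total1 == 0 || total2 == 0) {
            throw new IllegalArgumentException("each sample needs at least one observation");
        }

        // K rescales the two samples so that unequal sample sizes are comparable.
        double k = Math.sqrt((double) total2 / (double) total1);

        double sum = 0.0;
        for (int i = 0; i < observed1.length; i++) {
            long binTotal = observed1[i] + observed2[i];
            if (binTotal == 0) {
                throw new IllegalArgumentException("bin " + i + " is empty in both samples");
            }
            double diff = k * observed1[i] - observed2[i] / k;
            sum += diff * diff / binTotal;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Same proportions in every bin: the statistic is (numerically) zero.
        System.out.println(statistic(new long[] {10, 20, 30}, new long[] {20, 40, 60}));
        // Equal totals, swapped bins: each term is 100/30, so the result is about 6.67.
        System.out.println(statistic(new long[] {10, 20}, new long[] {20, 10}));
    }
}
```

In the second call K equals 1 because both samples total 30, so the statistic reduces to the sum of squared bin differences divided by the bin totals.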

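Preconditions:
Observed counts must be non-negative.
Observed counts for a specific bin must not both be zero.
Observed counts for a specific sample must not all be 0.
The arrays observed1 and observed2 must have the same length and their common length must be at least 2.

If any of the preconditions are not met, an IllegalArgumentException is thrown.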

Parameters:
observed1 - array of observed frequency counts of the first data set
observed2 - array of observed frequency counts of the second data set
Returns:
chiSquare statistic
Throws:
IllegalArgumentException - if preconditions are not met

chiSquareTestDataSetsComparison

double chiSquareTestDataSetsComparison(long[] observed1,
                                       long[] observed2)
                                       throws IllegalArgumentException,
                                              MathException

Returns the observed significance level, or p-value, associated with a Chi-Square two sample test comparing bin frequency counts in observed1 and observed2.

The number returned is the smallest significance level at which one can reject the null hypothesis that the observed counts conform to the same distribution.

See chiSquareDataSetsComparison(long[], long[]) for details on the formula used to compute the test statistic. The number of degrees of freedom used to perform the test is one less than the common length of the input observed count arrays.
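To make the link between the statistic, the degrees of freedom, and the p-value concrete, here is a hedged sketch of the upper-tail chi-square probability. It is deliberately restricted to an even number of degrees of freedom (that is, an odd number of bins), where the tail has the closed form exp(-x/2) * ∑ (x/2)^j / j! for j from 0 to df/2 - 1. This page does not show how the implementing class computes the p-value, so this is a swapped-in technique for illustration only; the names ChiSquarePValueSketch and upperTailEvenDf are hypothetical.

```java
/** Hypothetical helper class; not part of Commons Math. */
final class ChiSquarePValueSketch {

    /**
     * Upper-tail probability P(X >= statistic) for a chi-square variable X,
     * valid only when degreesOfFreedom is a positive even number.
     */
    static double upperTailEvenDf(double statistic, int degreesOfFreedom) {
        if (degreesOfFreedom <= 0 || degreesOfFreedom % 2 != 0) {
            throw new IllegalArgumentException("this sketch only handles positive even degrees of freedom");
        }
        if (statistic < 0) {
            throw new IllegalArgumentException("statistic must be non-negative");
        }
        double half = statistic / 2.0;
        double term = 1.0; // (x/2)^0 / 0!
        double sum = 1.0;
        for (int j = 1; j < degreesOfFreedom / 2; j++) {
            term *= half / j; // builds (x/2)^j / j! incrementally
            sum += term;
        }
        return Math.exp(-half) * sum;
    }

    public static void main(String[] args) {
        int bins = 3;                      // common length of the two count arrays
        int degreesOfFreedom = bins - 1;   // one less than the number of bins
        double statistic = 2.0 * Math.log(20.0);
        // With 2 degrees of freedom the tail is exp(-x/2), so this is about 0.05.
        System.out.println(upperTailEvenDf(statistic, degreesOfFreedom));
    }
}
```

A general-purpose implementation would need the regularized incomplete gamma function to cover odd degrees of freedom as well.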

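Preconditions:
Observed counts must be non-negative.
Observed counts for a specific bin must not both be zero.
Observed counts for a specific sample must not all be 0.
The arrays observed1 and observed2 must have the same length and their common length must be at least 2.

If any of the preconditions are not met, an IllegalArgumentException is thrown.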

Parameters:
observed1 - array of observed frequency counts of the first data set
observed2 - array of observed frequency counts of the second data set
Returns:
p-value
Throws:
IllegalArgumentException - if preconditions are not met
MathException - if an error occurs computing the p-value

chiSquareTestDataSetsComparison

boolean chiSquareTestDataSetsComparison(long[] observed1,
                                        long[] observed2,
                                        double alpha)
                                        throws IllegalArgumentException,
                                               MathException

Performs a Chi-Square two sample test comparing two binned data sets. The test evaluates the null hypothesis that the two lists of observed counts conform to the same frequency distribution, with significance level alpha. Returns true iff the null hypothesis can be rejected with 100 * (1 - alpha) percent confidence.

See chiSquareDataSetsComparison(long[], long[]) for details on the formula used to compute the Chi-Square statistic used in the test. The number of degrees of freedom used to perform the test is one less than the common length of the input observed count arrays.
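The decision described above can be sketched as a comparison of the p-value with alpha. This assumes the usual rule that the null hypothesis is rejected when the p-value falls below alpha; the class and method names are hypothetical, and the bounds check on alpha is an assumption made for illustration, not a documented part of this interface.

```java
/** Hypothetical helper class; not part of Commons Math. */
final class ChiSquareDecisionSketch {

    /** Returns true when the null hypothesis would be rejected at significance level alpha. */
    static boolean rejectNull(double pValue, double alpha) {
        // Assumed guard: significance levels outside (0, 0.5] are treated as invalid here.
        if (alpha <= 0.0 || alpha > 0.5) {
            throw new IllegalArgumentException("alpha out of range: " + alpha);
        }
        return pValue < alpha;
    }

    public static void main(String[] args) {
        double alpha = 0.05; // corresponds to 100 * (1 - 0.05) = 95 percent confidence
        System.out.println(rejectNull(0.01, alpha)); // p-value below alpha: reject
        System.out.println(rejectNull(0.20, alpha)); // p-value above alpha: do not reject
    }
}
```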

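Preconditions:
Observed counts must be non-negative.
Observed counts for a specific bin must not both be zero.
Observed counts for a specific sample must not all be 0.
The arrays observed1 and observed2 must have the same length and their common length must be at least 2.
0 < alpha < 0.5

If any of the preconditions are not met, an IllegalArgumentException is thrown.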

Parameters:
observed1 - array of observed frequency counts of the first data set
observed2 - array of observed frequency counts of the second data set
alpha - significance level of the test
Returns:
true iff null hypothesis can be rejected with confidence 1 - alpha
Throws:
IllegalArgumentException - if preconditions are not met
MathException - if an error occurs performing the test


Copyright © 2003-2011 The Apache Software Foundation. All Rights Reserved.