public class BIRCH extends java.lang.Object implements VectorQuantizer
BIRCH has several advantages. For example, each clustering decision is made without scanning all data points or all currently existing clusters. It exploits the observation that the data space is usually not uniformly occupied and that not every data point is equally important. It makes full use of available memory to derive the finest possible sub-clusters while minimizing I/O costs. It is also an incremental method that does not require the whole data set in advance.
This implementation produces a clustering in three steps. The first step builds a CF (clustering feature) tree in a single scan of the database. The second step clusters the leaves of the CF tree by hierarchical clustering. In the final step, the user can apply the learned model to classify input data. In total, the database is scanned twice.
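The CF tree in the first step stores compact summaries rather than raw points. The sketch below illustrates the usual clustering feature triple (point count, linear sum, sum of squares) and why it can be updated or merged without revisiting the data. The class name `ClusteringFeature` and its methods are hypothetical and written only for illustration; they are not this library's internal classes.

```java
import java.util.Arrays;

/** Hypothetical sketch of a clustering feature (CF); not the library's internal class. */
final class ClusteringFeature {
    int n;              // number of points summarized
    final double[] ls;  // linear sum of the points, per dimension
    double ss;          // sum of squared norms of the points

    ClusteringFeature(int d) {
        ls = new double[d];
    }

    /** Absorbs a single observation in O(d) time. */
    void add(double[] x) {
        n++;
        for (int i = 0; i < x.length; i++) {
            ls[i] += x[i];
            ss += x[i] * x[i];
        }
    }

    /** Merges another CF; CFs are additive, so no raw points are needed. */
    void merge(ClusteringFeature other) {
        n += other.n;
        ss += other.ss;
        for (int i = 0; i < ls.length; i++) {
            ls[i] += other.ls[i];
        }
    }

    /** Mean of the summarized points. */
    double[] centroid() {
        double[] c = new double[ls.length];
        for (int i = 0; i < c.length; i++) {
            c[i] = ls[i] / n;
        }
        return c;
    }

    /** Root-mean-square distance of the summarized points to the centroid. */
    double radius() {
        double c2 = 0.0;
        for (double v : ls) {
            c2 += (v / n) * (v / n);
        }
        // Clamp at zero to guard against tiny negative values from rounding.
        return Math.sqrt(Math.max(0.0, ss / n - c2));
    }

    public static void main(String[] args) {
        ClusteringFeature cf = new ClusteringFeature(2);
        cf.add(new double[] {0.0, 0.0});
        cf.add(new double[] {2.0, 0.0});
        System.out.println(Arrays.toString(cf.centroid())); // [1.0, 0.0]
        System.out.println(cf.radius());                    // 1.0
    }
}
```

In a sketch like this, a leaf entry could absorb a new point only if the resulting radius stays within the threshold T, which is how a single scan can bound the size of every sub-cluster.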
See Also: HierarchicalClustering, Serialized Form

Modifier and Type | Field and Description |
---|---|
int | B: The branching factor of non-leaf nodes. |
int | d: The dimensionality of data. |
int | L: The number of CF entries in the leaf nodes. |
double | T: The maximum radius of a sub-cluster. |

Fields inherited from interface VectorQuantizer: OUTLIER

Constructor and Description |
---|
BIRCH(int d, int B, int L, double T): Constructor. |

Modifier and Type | Method and Description |
---|---|
double[][] | centroids(): Returns the cluster centroids of leaf nodes. |
double[] | quantize(double[] x): Quantize a new observation. |
void | update(double[] x): Update the codebook with a new observation. |

public final int B
public final int L
public final double T
public final int d

public BIRCH(int d, int B, int L, double T)
Constructor.
Parameters:
d - the dimensionality of data.
B - the branching factor of non-leaf nodes, i.e. the maximum number of child nodes.
L - the number of CF entries in the leaf nodes.
T - the maximum radius of a sub-cluster.

public void update(double[] x)
Update the codebook with a new observation.
Specified by: update in interface VectorQuantizer

public double[] quantize(double[] x)
Quantize a new observation.
Specified by: quantize in interface VectorQuantizer
Parameters:
x - a new observation.

public double[][] centroids()
Returns the cluster centroids of leaf nodes.
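
As an illustration of how a centroid array such as the one returned by centroids() might be used, the following standalone sketch picks the centroid nearest to an observation by squared Euclidean distance. It does not call this class; `NearestCentroid` and `nearest` are hypothetical names, the centroid values are made up, and whether quantize(double[] x) works exactly this way is an assumption rather than something this page states.

```java
import java.util.Arrays;

/** Hypothetical helper; a sketch of nearest-centroid lookup, not the library's quantize(). */
final class NearestCentroid {
    /** Returns the row of centroids closest to x in squared Euclidean distance. */
    static double[] nearest(double[][] centroids, double[] x) {
        double[] best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (double[] c : centroids) {
            double dist = 0.0;
            for (int i = 0; i < x.length; i++) {
                double diff = x[i] - c[i];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best; // null when the centroid array is empty
    }

    public static void main(String[] args) {
        // Made-up stand-in for an array of cluster centroids.
        double[][] centroids = { {0.0, 0.0}, {10.0, 10.0} };
        double[] code = nearest(centroids, new double[] {9.0, 8.5});
        System.out.println(Arrays.toString(code)); // [10.0, 10.0]
    }
}
```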