jml.clustering
Class Clustering

java.lang.Object
  extended by jml.clustering.Clustering
Direct Known Subclasses:
KMeans, L1NMF, LLNMF, MRNMF, SpectralClustering

public abstract class Clustering
extends java.lang.Object

Abstract class for clustering algorithms.

Version:
1.0, Jan. 3rd, 2013
Author:
Mingjie Qian

Field Summary
protected  org.apache.commons.math.linear.RealMatrix centers
          Cluster matrix (nFeature x nClus); column i is the projector for class i.
protected  org.apache.commons.math.linear.RealMatrix dataMatrix
          Data matrix (nFeature x nSample); each column is the feature vector of one sample.
protected  org.apache.commons.math.linear.RealMatrix indicatorMatrix
          Cluster indicator matrix (nSample x nClus).
 int nClus
          Number of clusters.
 int nFeature
          Number of features.
 int nSample
          Number of samples.
 
Constructor Summary
Clustering()
          Default constructor for this clustering algorithm.
Clustering(ClusteringOptions clusteringOptions)
          Constructor for this clustering algorithm initialized with options wrapped in a ClusteringOptions object.
Clustering(int nClus)
          Constructor for this clustering algorithm with the given number of clusters.
 
Method Summary
abstract  void clustering()
          Do clustering.
 void clustering(org.apache.commons.math.linear.RealMatrix G0)
          Do clustering with a specified initializer.
 void feedData(double[][] data)
          Feed training data for this clustering algorithm.
 void feedData(org.apache.commons.math.linear.RealMatrix dataMatrix)
          Feed training data for this clustering algorithm.
static double getAccuracy(org.apache.commons.math.linear.RealMatrix G, org.apache.commons.math.linear.RealMatrix groundTruth)
          Evaluate clustering performance against the ground truth.
 org.apache.commons.math.linear.RealMatrix getCenters()
          Get cluster centers.
 org.apache.commons.math.linear.RealMatrix getData()
          Fetch data matrix.
 org.apache.commons.math.linear.RealMatrix getIndicatorMatrix()
          Get cluster indicator matrix.
 void initialize(org.apache.commons.math.linear.RealMatrix G0)
          Initialize the indicator matrix.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

nClus

public int nClus
Number of clusters.


nFeature

public int nFeature
Number of features.


nSample

public int nSample
Number of samples.


dataMatrix

protected org.apache.commons.math.linear.RealMatrix dataMatrix
Data matrix (nFeature x nSample); each column is the feature vector of one sample.


indicatorMatrix

protected org.apache.commons.math.linear.RealMatrix indicatorMatrix
Cluster indicator matrix (nSample x nClus).


centers

protected org.apache.commons.math.linear.RealMatrix centers
Cluster matrix (nFeature x nClus); column i is the projector for class i.

Constructor Detail

Clustering

public Clustering()
Default constructor for this clustering algorithm.


Clustering

public Clustering(ClusteringOptions clusteringOptions)
Constructor for this clustering algorithm initialized with options wrapped in a ClusteringOptions object.

Parameters:
clusteringOptions - clustering options

Clustering

public Clustering(int nClus)
Constructor for this clustering algorithm with the given number of clusters.

Parameters:
nClus - number of clusters

Method Detail

feedData

public void feedData(org.apache.commons.math.linear.RealMatrix dataMatrix)
Feed training data for this clustering algorithm.

Parameters:
dataMatrix - a d x n data matrix with each column being a data example

feedData

public void feedData(double[][] data)
Feed training data for this clustering algorithm.

Parameters:
data - a d x n 2D double array with each column being a data example
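
Because feedData expects a d x n array (one column per example), data stored with one row per example has to be transposed first. The following is a minimal sketch of that conversion using plain arrays only; the class and method names (DataLayoutSketch, toFeatureBySample) are hypothetical and not part of jml.

```java
// Hypothetical helper, not part of jml: illustrates the d x n layout only.
final class DataLayoutSketch {

    /**
     * Converts an n x d array (one row per sample) into a d x n array
     * (one column per sample), the layout described for feedData.
     */
    static double[][] toFeatureBySample(double[][] rows) {
        int n = rows.length;
        int d = (n == 0) ? 0 : rows[0].length;
        double[][] cols = new double[d][n];
        for (int i = 0; i < n; i++) {
            if (rows[i].length != d) {
                throw new IllegalArgumentException("All samples must have the same number of features");
            }
            for (int j = 0; j < d; j++) {
                cols[j][i] = rows[i][j]; // feature j of sample i
            }
        }
        return cols;
    }
}
```

The resulting array could then be passed to feedData(double[][] data).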

initialize

public void initialize(org.apache.commons.math.linear.RealMatrix G0)
Initialize the indicator matrix.

Parameters:
G0 - initial indicator matrix
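
This page specifies only the shape of the indicator matrix (nSample x nClus). As an illustration, the sketch below builds a hard (one-hot) indicator array from integer cluster labels. Treating each row as one-hot is an assumption; the class and method names (IndicatorSketch, oneHot) are hypothetical and not part of jml.

```java
// Hypothetical helper, not part of jml: builds an n x K one-hot indicator array.
final class IndicatorSketch {

    /**
     * Returns an n x nClus array whose row i has a 1.0 in column labels[i]
     * and 0.0 elsewhere.
     */
    static double[][] oneHot(int[] labels, int nClus) {
        double[][] g = new double[labels.length][nClus];
        for (int i = 0; i < labels.length; i++) {
            if (labels[i] < 0 || labels[i] >= nClus) {
                throw new IllegalArgumentException("Label out of range: " + labels[i]);
            }
            g[i][labels[i]] = 1.0; // sample i belongs to cluster labels[i]
        }
        return g;
    }
}
```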

clustering

public abstract void clustering()
Do clustering. Call initialize(G0) before using this method.


clustering

public void clustering(org.apache.commons.math.linear.RealMatrix G0)
Do clustering with a specified initializer. Pass null to use random initialization.

Parameters:
G0 - initial indicator matrix; if null, random initialization is used

getData

public org.apache.commons.math.linear.RealMatrix getData()
Fetch data matrix.

Returns:
a d x n data matrix

getCenters

public org.apache.commons.math.linear.RealMatrix getCenters()
Get cluster centers.

Returns:
a d x K basis matrix
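
To make the d x K shape concrete, the sketch below computes one possible centers array as the per-cluster mean of the data columns. This is an assumption that suits a k-means-style reading of "centers"; subclasses such as the NMF variants may define the basis matrix differently. The names CentersSketch and clusterMeans are hypothetical and not part of jml.

```java
// Hypothetical helper, not part of jml: per-cluster column means as a d x K array.
final class CentersSketch {

    /**
     * data is d x n (one column per sample), labels has length n,
     * and the result is d x nClus with column k holding the mean of cluster k.
     * Columns of empty clusters are left as zeros.
     */
    static double[][] clusterMeans(double[][] data, int[] labels, int nClus) {
        int d = data.length;
        int n = labels.length;
        double[][] centers = new double[d][nClus];
        int[] counts = new int[nClus];
        for (int i = 0; i < n; i++) {
            counts[labels[i]]++;
        }
        for (int j = 0; j < d; j++) {
            for (int i = 0; i < n; i++) {
                centers[j][labels[i]] += data[j][i]; // accumulate feature j for the sample's cluster
            }
            for (int k = 0; k < nClus; k++) {
                if (counts[k] > 0) {
                    centers[j][k] /= counts[k];
                }
            }
        }
        return centers;
    }
}
```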

getIndicatorMatrix

public org.apache.commons.math.linear.RealMatrix getIndicatorMatrix()
Get cluster indicator matrix.

Returns:
an n x K cluster indicator matrix
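
An n x K indicator matrix can be reduced to one cluster label per sample by taking the column with the largest entry in each row. The sketch below assumes the matrix has already been copied into a plain double[][] (for example via RealMatrix.getData()); the names IndicatorLabelsSketch and toLabels are hypothetical and not part of jml.

```java
// Hypothetical helper, not part of jml: row-wise argmax over an n x K indicator array.
final class IndicatorLabelsSketch {

    /** Returns, for each row, the index of its largest entry (first index wins ties). */
    static int[] toLabels(double[][] indicator) {
        int[] labels = new int[indicator.length];
        for (int i = 0; i < indicator.length; i++) {
            int best = 0;
            for (int k = 1; k < indicator[i].length; k++) {
                if (indicator[i][k] > indicator[i][best]) {
                    best = k;
                }
            }
            labels[i] = best;
        }
        return labels;
    }
}
```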

getAccuracy

public static double getAccuracy(org.apache.commons.math.linear.RealMatrix G,
                                 org.apache.commons.math.linear.RealMatrix groundTruth)
Evaluate clustering performance against the ground truth.

Parameters:
G - predicted cluster indicator matrix
groundTruth - true cluster assignments
Returns:
the clustering accuracy
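
This page does not say how getAccuracy computes its result. One common definition of clustering accuracy, shown in the sketch below, is the fraction of samples labeled correctly under the best one-to-one mapping between predicted clusters and true classes. That definition is an assumption here and not necessarily what jml implements. For brevity the sketch works on integer label arrays rather than indicator matrices and tries all mappings by brute force, so it is only practical for small K. The names ClusteringAccuracySketch and accuracy are hypothetical.

```java
// Hypothetical helper, not part of jml: accuracy under the best cluster-to-class mapping.
final class ClusteringAccuracySketch {

    /** predicted and truth hold labels in [0, k); returns a value in [0, 1]. */
    static double accuracy(int[] predicted, int[] truth, int k) {
        if (predicted.length != truth.length) {
            throw new IllegalArgumentException("Label arrays must have the same length");
        }
        if (predicted.length == 0) {
            return 0.0;
        }
        // counts[p][t] = number of samples in predicted cluster p with true class t
        int[][] counts = new int[k][k];
        for (int i = 0; i < predicted.length; i++) {
            counts[predicted[i]][truth[i]]++;
        }
        int matched = bestMatch(counts, 0, new boolean[k]);
        return (double) matched / predicted.length;
    }

    // Exhaustively assigns each predicted cluster to a distinct true class
    // and returns the largest total number of agreeing samples.
    private static int bestMatch(int[][] counts, int cluster, boolean[] used) {
        if (cluster == counts.length) {
            return 0;
        }
        int best = 0;
        for (int c = 0; c < counts.length; c++) {
            if (used[c]) {
                continue;
            }
            used[c] = true;
            best = Math.max(best, counts[cluster][c] + bestMatch(counts, cluster + 1, used));
            used[c] = false;
        }
        return best;
    }
}
```

Because cluster indices are arbitrary, a prediction that merely permutes the true labels scores 1.0 under this definition.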