jml.classification
Class Classifier

java.lang.Object
  extended by jml.classification.Classifier
All Implemented Interfaces:
java.io.Serializable
Direct Known Subclasses:
AdaBoost, LogisticRegressionMCBoundConstrainedPLBFGS, LogisticRegressionMCGradientDescent, LogisticRegressionMCLBFGS, LogisticRegressionMCLBFGS_Ori, LogisticRegressionMCNonlinearConjugateGradient, LogisticRegressionMCNonnegativePLBFGS, MaxEnt, MultiClassSVM

public abstract class Classifier
extends java.lang.Object
implements java.io.Serializable

Abstract superclass for all classifiers.

Version:
1.0 Dec. 30th, 2012
Author:
Mingjie Qian
See Also:
Serialized Form

Field Summary
 double epsilon
          Convergence tolerance.
(package private)  int[] IDLabelMap
          An ID to integer label mapping array.
(package private)  int[] labelIDs
          LabelID array for training data, starting from 0.
(package private)  int[] labels
          Label array for training data with original integer code.
 int nClass
          Number of classes.
 int nExample
          Number of samples.
 int nFeature
          Number of features, excluding bias dummy features (e.g., for SVM).
private static long serialVersionUID
           
 org.apache.commons.math.linear.RealMatrix W
          Projection matrix (nFeature x nClass), column i is the projector for class i.
 org.apache.commons.math.linear.RealMatrix X
          Training data matrix (nFeature x nExample), each column is a feature vector.
 org.apache.commons.math.linear.RealMatrix Y
          Label matrix for training (nExample x nClass).
 
Constructor Summary
Classifier()
          Default constructor for a classifier.
Classifier(Options options)
          Constructor for a classifier initialized with options wrapped in an Options object.
 
Method Summary
static int calcNumClass(int[] labels)
          Infer the number of classes from a given label sequence.
 void feedData(double[][] data)
          Feed training data for this classification method.
 void feedData(org.apache.commons.math.linear.RealMatrix X)
          Feed training data with original data matrix for this classifier.
 void feedLabels(double[][] labels)
          Feed labels for this classification method.
 void feedLabels(int[] labels)
          Feed labels of training data to the classifier.
 void feedLabels(org.apache.commons.math.linear.RealMatrix Y)
          Feed labels for training data from a matrix.
static double getAccuracy(int[] pre_labels, int[] labels)
          Get accuracy for a classification task.
static int[] getIDLabelMap(int[] labels)
          Get an ID to integer label mapping array.
static java.util.TreeMap<java.lang.Integer,java.lang.Integer> getLabelIDMap(int[] labels)
          Get a mapping from labels to IDs.
 org.apache.commons.math.linear.RealMatrix getProjectionMatrix()
          Get projection matrix for this classifier.
 org.apache.commons.math.linear.RealMatrix getTrainingLabelMatrix()
          Get ground truth label matrix for training data.
static org.apache.commons.math.linear.RealMatrix labelIndexArray2LabelMatrix(int[] labelIndices, int nClass)
          Convert a label index array to a label matrix.
static int[] labelScoreMatrix2LabelIndexArray(org.apache.commons.math.linear.RealMatrix Y)
          Convert a label matrix to a label index array.
abstract  void loadModel(java.lang.String filePath)
          Load the model for a classifier.
 int[] predict(double[][] Xt)
          Predict the labels for the test data formatted as an original 2D double array.
 int[] predict(org.apache.commons.math.linear.RealMatrix Xt)
          Predict the labels for the test data formatted as an original data matrix.
 org.apache.commons.math.linear.RealMatrix predictLabelMatrix(double[][] Xt)
          Predict the label matrix given test data formatted as an original 2D double array.
 org.apache.commons.math.linear.RealMatrix predictLabelMatrix(org.apache.commons.math.linear.RealMatrix Xt)
          Predict the label matrix given test data formatted as an original data matrix.
 org.apache.commons.math.linear.RealMatrix predictLabelScoreMatrix(double[][] Xt)
          Predict the label score matrix given test data formatted as an original 2D double array.
abstract  org.apache.commons.math.linear.RealMatrix predictLabelScoreMatrix(org.apache.commons.math.linear.RealMatrix Xt)
          Predict the label score matrix given test data formatted as an original data matrix.
abstract  void saveModel(java.lang.String filePath)
          Save the model for a classifier.
abstract  void train()
          Train the classifier.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

serialVersionUID

private static final long serialVersionUID
See Also:
Constant Field Values

nClass

public int nClass
Number of classes.


nFeature

public int nFeature
Number of features, excluding bias dummy features (e.g., for SVM).


nExample

public int nExample
Number of samples.


X

public org.apache.commons.math.linear.RealMatrix X
Training data matrix (nFeature x nExample), each column is a feature vector. The data matrix should not include bias dummy features.


Y

public org.apache.commons.math.linear.RealMatrix Y
Label matrix for training (nExample x nClass). Y_{i,k} = 1 if x_i belongs to class k, and 0 otherwise.


labelIDs

int[] labelIDs
Label ID array for the training data, starting from 0. The label IDs are internal to the classifier, so callers do not need to know them. They are only used to reconstruct the original integer labels through the IDLabelMap array.


labels

int[] labels
Label array for training data with original integer code.


W

public org.apache.commons.math.linear.RealMatrix W
Projection matrix (nFeature x nClass), column i is the projector for class i.


epsilon

public double epsilon
Convergence tolerance.


IDLabelMap

int[] IDLabelMap
An ID to integer label mapping array. IDs start from 0.

Constructor Detail

Classifier

public Classifier()
Default constructor for a classifier.


Classifier

public Classifier(Options options)
Constructor for a classifier initialized with options wrapped in an Options object.

Parameters:
options - classification options
Method Detail

loadModel

public abstract void loadModel(java.lang.String filePath)
Load the model for a classifier.

Parameters:
filePath - file path to load the model

saveModel

public abstract void saveModel(java.lang.String filePath)
Save the model for a classifier.

Parameters:
filePath - file path to save the model

feedData

public void feedData(org.apache.commons.math.linear.RealMatrix X)
Feed training data with original data matrix for this classifier.

Parameters:
X - original data matrix without bias dummy features

feedData

public void feedData(double[][] data)
Feed training data for this classification method.

Parameters:
data - a d x n 2D double array with each column being a data sample
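
The d x n layout means that each sample occupies a column rather than a row. As a minimal sketch, assuming the caller starts from a row-per-sample array, the data can be transposed before being fed. The class and method names below are hypothetical and not part of jml.

```java
import java.util.Arrays;

// Hypothetical helper, not part of jml: converts a row-per-sample array
// (n x d) into the column-per-sample layout (d x n) described above.
final class ColumnMajorSamples {
    static double[][] toColumnsAsSamples(double[][] rowsAsSamples) {
        int n = rowsAsSamples.length;
        int d = n == 0 ? 0 : rowsAsSamples[0].length;
        double[][] data = new double[d][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < d; j++) {
                data[j][i] = rowsAsSamples[i][j]; // feature j of sample i
            }
        }
        return data;
    }

    public static void main(String[] args) {
        double[][] samples = { {1.0, 2.0, 3.0}, {4.0, 5.0, 6.0} }; // 2 samples, 3 features
        double[][] data = toColumnsAsSamples(samples);             // 3 x 2
        System.out.println(Arrays.deepToString(data));
    }
}
```

The resulting array has one row per feature and one column per sample, which is the shape the parameter description asks for.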

calcNumClass

public static int calcNumClass(int[] labels)
Infer the number of classes from a given label sequence.

Parameters:
labels - any integer array holding the original integer labels
Returns:
number of classes
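
The source does not state how the number of classes is inferred. A minimal sketch, assuming it equals the number of distinct label values, could look as follows. The class and method names are hypothetical.

```java
import java.util.TreeSet;

// Hypothetical sketch, not the jml implementation: assumes the number of
// classes equals the number of distinct integer labels in the array.
final class NumClassSketch {
    static int countDistinctLabels(int[] labels) {
        TreeSet<Integer> distinct = new TreeSet<>();
        for (int label : labels) {
            distinct.add(label); // duplicates are ignored by the set
        }
        return distinct.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinctLabels(new int[] {3, 7, 3, 10}));
    }
}
```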

getIDLabelMap

public static int[] getIDLabelMap(int[] labels)
Get an ID to integer label mapping array. IDs start from 0.

Parameters:
labels - any integer array holding the original integer labels
Returns:
ID to integer label mapping array
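
To illustrate what an ID to label mapping array holds, here is a minimal sketch. It assumes IDs are assigned in order of first appearance, which the source states only for the label matrix case, so the real ordering may differ. The names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch, not the jml implementation. Assumption: the label
// seen first gets ID 0, the next new label gets ID 1, and so on.
final class IdLabelMapSketch {
    static int[] idToLabel(int[] labels) {
        Set<Integer> distinct = new LinkedHashSet<>(); // keeps insertion order
        for (int label : labels) {
            distinct.add(label);
        }
        int[] map = new int[distinct.size()];
        int id = 0;
        for (int label : distinct) {
            map[id++] = label; // map[id] is the original label for that ID
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(idToLabel(new int[] {7, 3, 7, 10})));
    }
}
```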

getLabelIDMap

public static java.util.TreeMap<java.lang.Integer,java.lang.Integer> getLabelIDMap(int[] labels)
Get a mapping from labels to IDs. IDs start from 0.

Parameters:
labels - any integer array holding the original integer labels
Returns:
a mapping from labels to IDs
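
The inverse direction, from original labels to IDs, can be sketched in the same way. This example again assumes IDs follow the order of first appearance; the names are hypothetical, and only the TreeMap return type is taken from the signature above.

```java
import java.util.TreeMap;

// Hypothetical sketch, not the jml implementation. Assumption: IDs are
// handed out in order of first appearance, starting from 0.
final class LabelIdMapSketch {
    static TreeMap<Integer, Integer> labelToId(int[] labels) {
        TreeMap<Integer, Integer> map = new TreeMap<>();
        for (int label : labels) {
            if (!map.containsKey(label)) {
                map.put(label, map.size()); // next unused ID
            }
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(labelToId(new int[] {7, 3, 7, 10}));
    }
}
```

Note that a TreeMap iterates in ascending key order, so the labels appear sorted even though the IDs reflect the order in which labels were first seen.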

feedLabels

public void feedLabels(int[] labels)
Feed labels of training data to the classifier.

Parameters:
labels - any integer array holding the original integer labels

feedLabels

public void feedLabels(org.apache.commons.math.linear.RealMatrix Y)
Feed labels for training data from a matrix. If the classifier is fed only a label matrix, there are no original integer labels. In this case, label IDs are inferred from the label matrix: the first observed label index is assigned ID 0, the second observed label index is assigned ID 1, and so on. The labels array then holds the label indices found in the given label matrix.

Parameters:
Y - an N x K label matrix, where N is the number of training samples, and K is the number of classes
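
The inference rule described above can be sketched with plain arrays. This is an illustration under stated assumptions, not the jml code: the label matrix is represented as a double[][] instead of a RealMatrix, the label index of a row is taken as the column holding its largest entry, and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of inferring labels and label IDs from an N x K
// label matrix stored as a plain 2D array.
final class LabelMatrixInference {
    // Label index of each row: the column with the largest entry.
    static int[] labelIndices(double[][] Y) {
        int[] indices = new int[Y.length];
        for (int i = 0; i < Y.length; i++) {
            int best = 0;
            for (int k = 1; k < Y[i].length; k++) {
                if (Y[i][k] > Y[i][best]) best = k;
            }
            indices[i] = best;
        }
        return indices;
    }

    // Label IDs: first observed label index gets ID 0, the second gets 1, ...
    static int[] labelIDs(int[] labelIndices) {
        Map<Integer, Integer> idOf = new LinkedHashMap<>();
        int[] ids = new int[labelIndices.length];
        for (int i = 0; i < labelIndices.length; i++) {
            Integer id = idOf.get(labelIndices[i]);
            if (id == null) {
                id = idOf.size();
                idOf.put(labelIndices[i], id);
            }
            ids[i] = id;
        }
        return ids;
    }

    public static void main(String[] args) {
        double[][] Y = { {0, 0, 1}, {1, 0, 0}, {0, 0, 1} };
        int[] labels = labelIndices(Y);
        System.out.println(Arrays.toString(labels));
        System.out.println(Arrays.toString(labelIDs(labels)));
    }
}
```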

feedLabels

public void feedLabels(double[][] labels)
Feed labels for this classification method.

Parameters:
labels - an n x c 2D double array, where n is the number of training samples and c is the number of classes

train

public abstract void train()
Train the classifier.


predict

public int[] predict(org.apache.commons.math.linear.RealMatrix Xt)
Predict the labels for the test data formatted as an original data matrix. The original data matrix should not include bias dummy features.

Parameters:
Xt - test data matrix with each column being a feature vector
Returns:
predicted label array with original integer label code

predict

public int[] predict(double[][] Xt)
Predict the labels for the test data formatted as an original 2D double array. The original data matrix should not include bias dummy features.

Parameters:
Xt - a d x n 2D double array with each column being a data sample
Returns:
predicted label array with original integer label code
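
Because the result uses the original integer label code, predicted class IDs have to be translated back at some point. A minimal sketch of that step, assuming an ID to label array like the one kept in IDLabelMap, is shown below. The class and method names are hypothetical, and the source does not show how predict performs this translation.

```java
import java.util.Arrays;

// Hypothetical sketch: maps predicted class IDs (0-based) back to the
// original integer labels using an ID-to-label array.
final class RestoreLabelsSketch {
    static int[] toOriginalLabels(int[] predictedIDs, int[] idLabelMap) {
        int[] labels = new int[predictedIDs.length];
        for (int i = 0; i < predictedIDs.length; i++) {
            labels[i] = idLabelMap[predictedIDs[i]]; // look up the original code
        }
        return labels;
    }

    public static void main(String[] args) {
        int[] idLabelMap = {7, 3, 10};  // ID 0 -> 7, ID 1 -> 3, ID 2 -> 10
        int[] predictedIDs = {0, 2, 1};
        System.out.println(Arrays.toString(toOriginalLabels(predictedIDs, idLabelMap)));
    }
}
```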

predictLabelMatrix

public org.apache.commons.math.linear.RealMatrix predictLabelMatrix(org.apache.commons.math.linear.RealMatrix Xt)
Predict the label matrix given test data formatted as an original data matrix. Note that this method is not itself declared abstract; the abstract method that subclasses must implement is predictLabelScoreMatrix(RealMatrix).

Parameters:
Xt - test data matrix with each column being a feature vector
Returns:
predicted N x K label matrix, where N is the number of test samples, and K is the number of classes

predictLabelMatrix

public org.apache.commons.math.linear.RealMatrix predictLabelMatrix(double[][] Xt)
Predict the label matrix given test data formatted as an original 2D double array.

Parameters:
Xt - a d x n 2D double array with each column being a data sample
Returns:
predicted N x K label matrix, where N is the number of test samples, and K is the number of classes

predictLabelScoreMatrix

public abstract org.apache.commons.math.linear.RealMatrix predictLabelScoreMatrix(org.apache.commons.math.linear.RealMatrix Xt)
Predict the label score matrix given test data formatted as an original data matrix. Note that a method declared abstract in an abstract class has no body, like an interface method, so each concrete subclass must implement it rather than override an existing implementation.

Parameters:
Xt - test data matrix with each column being a feature vector
Returns:
predicted N x K label score matrix, where N is the number of test samples, and K is the number of classes

predictLabelScoreMatrix

public org.apache.commons.math.linear.RealMatrix predictLabelScoreMatrix(double[][] Xt)
Predict the label score matrix given test data formatted as an original 2D double array.

Parameters:
Xt - a d x n 2D double array with each column being a data sample
Returns:
predicted N x K label score matrix, where N is the number of test samples, and K is the number of classes

getAccuracy

public static double getAccuracy(int[] pre_labels,
                                 int[] labels)
Get accuracy for a classification task.

Parameters:
pre_labels - predicted labels
labels - true labels
Returns:
accuracy
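
Accuracy is conventionally the fraction of predictions that match the true labels. A minimal sketch under that assumption follows. The names are hypothetical, and the source does not say whether the library returns a fraction or a percentage.

```java
// Hypothetical sketch, not the jml implementation: accuracy as the
// fraction of positions where predicted and true labels agree.
final class AccuracySketch {
    static double accuracy(int[] predicted, int[] truth) {
        if (predicted.length != truth.length || truth.length == 0) {
            throw new IllegalArgumentException("Arrays must be non-empty and of equal length");
        }
        int correct = 0;
        for (int i = 0; i < truth.length; i++) {
            if (predicted[i] == truth[i]) {
                correct++;
            }
        }
        return (double) correct / truth.length;
    }

    public static void main(String[] args) {
        System.out.println(accuracy(new int[] {1, 2, 3, 4}, new int[] {1, 2, 0, 4}));
    }
}
```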

getProjectionMatrix

public org.apache.commons.math.linear.RealMatrix getProjectionMatrix()
Get projection matrix for this classifier.

Returns:
a d x c projection matrix
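
One way to read the d x c projection matrix is as a set of per-class linear scorers. The sketch below assumes plain linear scoring, scores = Xt^T W, on 2D arrays. The source does not say that every subclass scores this way (some may add a bias term or a softmax), and the names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch of linear scoring with a d x c projection matrix.
// Xt is d x n (one sample per column), W is d x c, the result is n x c.
final class LinearScoreSketch {
    static double[][] scores(double[][] Xt, double[][] W) {
        int d = W.length;
        if (d == 0 || Xt.length != d) {
            throw new IllegalArgumentException("Xt and W must share the feature dimension");
        }
        int n = Xt[0].length;
        int c = W[0].length;
        double[][] s = new double[n][c];
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < c; k++) {
                double sum = 0.0;
                for (int j = 0; j < d; j++) {
                    sum += Xt[j][i] * W[j][k]; // feature j of sample i times weight for class k
                }
                s[i][k] = sum;
            }
        }
        return s;
    }

    public static void main(String[] args) {
        double[][] Xt = { {1, 2}, {3, 4} }; // samples are the columns (1,3) and (2,4)
        double[][] W = { {1, 1}, {0, 2} };
        System.out.println(Arrays.deepToString(scores(Xt, W)));
    }
}
```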

getTrainingLabelMatrix

public org.apache.commons.math.linear.RealMatrix getTrainingLabelMatrix()
Get ground truth label matrix for training data.

Returns:
an n x c label matrix

labelScoreMatrix2LabelIndexArray

public static int[] labelScoreMatrix2LabelIndexArray(org.apache.commons.math.linear.RealMatrix Y)
Convert a label matrix to a label index array. Label indices start from 0.

Parameters:
Y - label matrix
Returns:
a label index array
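
Converting a score matrix to label indices is typically a row-wise argmax. The following sketch assumes that interpretation, uses a double[][] in place of a RealMatrix, and resolves ties in favor of the lowest index. The names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch: for each row of an N x K score matrix, return the
// 0-based column index of the largest score (first one wins on ties).
final class ArgmaxSketch {
    static int[] argmaxPerRow(double[][] scores) {
        int[] indices = new int[scores.length];
        for (int i = 0; i < scores.length; i++) {
            int best = 0;
            for (int k = 1; k < scores[i].length; k++) {
                if (scores[i][k] > scores[i][best]) {
                    best = k;
                }
            }
            indices[i] = best;
        }
        return indices;
    }

    public static void main(String[] args) {
        double[][] scores = { {0.1, 0.7, 0.2}, {0.9, 0.05, 0.05} };
        System.out.println(Arrays.toString(argmaxPerRow(scores)));
    }
}
```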

labelIndexArray2LabelMatrix

public static org.apache.commons.math.linear.RealMatrix labelIndexArray2LabelMatrix(int[] labelIndices,
                                                                                    int nClass)
Convert a label index array to a label matrix. Label indices start from 0.

Parameters:
labelIndices - a label index array
nClass - number of classes
Returns:
label matrix
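
The conversion in the other direction is a one-hot encoding, matching the definition of Y given in the field details (entry 1 for the sample's class, 0 elsewhere). A minimal sketch with plain arrays follows. The names are hypothetical and the real method returns a RealMatrix.

```java
import java.util.Arrays;

// Hypothetical sketch: builds an N x nClass one-hot label matrix from
// 0-based label indices.
final class OneHotSketch {
    static double[][] toLabelMatrix(int[] labelIndices, int nClass) {
        double[][] Y = new double[labelIndices.length][nClass]; // zero-initialized
        for (int i = 0; i < labelIndices.length; i++) {
            int k = labelIndices[i];
            if (k < 0 || k >= nClass) {
                throw new IllegalArgumentException("Label index out of range: " + k);
            }
            Y[i][k] = 1.0; // sample i belongs to class k
        }
        return Y;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.deepToString(toLabelMatrix(new int[] {2, 0}, 3)));
    }
}
```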