jml.topics
Class LDA

java.lang.Object
  extended by jml.topics.TopicModel
      extended by jml.topics.LDA

public class LDA
extends TopicModel


Field Summary
(package private)  LdaGibbsSampler gibbsSampler
           
 
Fields inherited from class jml.topics.TopicModel
dataMatrix, indicatorMatrix, nTopic, topicMatrix
 
Constructor Summary
LDA()
           
LDA(int nTopic)
           
LDA(LDAOptions LDAOptions)
           
 
Method Summary
static void main(java.lang.String[] args)
           
 void readCorpus(java.util.ArrayList<java.util.TreeMap<java.lang.Integer,java.lang.Integer>> docTermCountArray)
          Load corpus and documents from an ArrayList<TreeMap<Integer, Integer>> instance.
 void readCorpus(int[][] documents)
          Feed documents from a 2D integer array.
 void readCorpus(org.apache.commons.math.linear.RealMatrix X)
          Load corpus and documents from a RealMatrix instance.
 void readCorpus(java.lang.String LDAInputDataFilePath)
          Load corpus and documents from an LDAInput file.
 void readCorpusFromDocTermCountFile(java.lang.String docTermCountFilePath)
          Load corpus and documents from the doc-term-count text file located at docTermCountFilePath.
 void train()
          Train this topic model to fit the given corpus.
 
Methods inherited from class jml.topics.TopicModel
getIndicatorMatrix, getTopicMatrix
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

gibbsSampler

LdaGibbsSampler gibbsSampler
Constructor Detail

LDA

public LDA(LDAOptions LDAOptions)

LDA

public LDA()

LDA

public LDA(int nTopic)
Method Detail

main

public static void main(java.lang.String[] args)
Parameters:
args - command-line arguments

train

public void train()
Description copied from class: TopicModel
Train this topic model to fit the given corpus.

Specified by:
train in class TopicModel

readCorpus

public void readCorpus(java.util.ArrayList<java.util.TreeMap<java.lang.Integer,java.lang.Integer>> docTermCountArray)
Load corpus and documents from an ArrayList<TreeMap<Integer, Integer>> instance. Each element of the ArrayList is the doc-term count mapping for one document.

Parameters:
docTermCountArray - An ArrayList<TreeMap<Integer, Integer>> instance. Each element of the ArrayList records the doc-term count mapping for the corresponding document.
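
The following is a minimal sketch of how such an argument might be built. It assumes that each TreeMap maps a term index to the number of times that term occurs in the document; the page above does not state the key/value roles explicitly. The class DocTermCountSketch and the helper countTerms are hypothetical names used only for illustration, and the call into LDA is shown in a comment because it needs the jml library on the classpath.

```java
import java.util.ArrayList;
import java.util.TreeMap;

public class DocTermCountSketch {

    /**
     * Hypothetical helper: counts how often each term index occurs in each
     * document. Assumes key = term index, value = occurrence count.
     */
    static ArrayList<TreeMap<Integer, Integer>> countTerms(int[][] documents) {
        ArrayList<TreeMap<Integer, Integer>> result =
                new ArrayList<TreeMap<Integer, Integer>>();
        for (int[] doc : documents) {
            TreeMap<Integer, Integer> counts = new TreeMap<Integer, Integer>();
            for (int term : doc) {
                Integer old = counts.get(term);
                counts.put(term, old == null ? 1 : old + 1);
            }
            result.add(counts);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two toy documents given as sequences of term indices.
        int[][] documents = { {0, 2, 2}, {1, 1, 3} };
        ArrayList<TreeMap<Integer, Integer>> docTermCountArray = countTerms(documents);
        System.out.println(docTermCountArray);

        // Assumed usage with the class documented here (not executed in this sketch):
        // LDA lda = new LDA(2);
        // lda.readCorpus(docTermCountArray);
        // lda.train();
    }
}
```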

readCorpus

public void readCorpus(java.lang.String LDAInputDataFilePath)
Load corpus and documents from an LDAInput file. Term indices must start from 0.

Parameters:
LDAInputDataFilePath - The path of the LDAInput file.

readCorpusFromDocTermCountFile

public void readCorpusFromDocTermCountFile(java.lang.String docTermCountFilePath)
Load corpus and documents from the doc-term-count text file located at docTermCountFilePath.

Parameters:
docTermCountFilePath - A String specifying the location of the text file holding doc-term-count matrix data.

readCorpus

public void readCorpus(int[][] documents)
Feed documents from a 2D integer array.

Parameters:
documents - a 2D integer array where documents[m][n] is the term index in the vocabulary for the n-th word of the m-th document. Indices always start from 0.
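
As an illustration of the documents[m][n] layout, the sketch below turns tokenized text into such an array, assigning each new word the next free index so that indices start from 0. The class DocumentEncodingSketch, the helper encode, and the vocabulary map are hypothetical and not part of jml; how a vocabulary is actually built is left to the caller.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DocumentEncodingSketch {

    /**
     * Hypothetical helper: maps each word to a 0-based term index and returns
     * documents[m][n] = term index of the n-th word of the m-th document.
     * New words are added to the supplied vocabulary as they are seen.
     */
    static int[][] encode(String[][] tokenizedDocs, Map<String, Integer> vocabulary) {
        int[][] documents = new int[tokenizedDocs.length][];
        for (int m = 0; m < tokenizedDocs.length; m++) {
            documents[m] = new int[tokenizedDocs[m].length];
            for (int n = 0; n < tokenizedDocs[m].length; n++) {
                String word = tokenizedDocs[m][n];
                Integer index = vocabulary.get(word);
                if (index == null) {
                    index = vocabulary.size(); // next unused index, starting at 0
                    vocabulary.put(word, index);
                }
                documents[m][n] = index;
            }
        }
        return documents;
    }

    public static void main(String[] args) {
        String[][] tokenizedDocs = { {"cat", "dog", "cat"}, {"dog", "fish"} };
        Map<String, Integer> vocabulary = new LinkedHashMap<String, Integer>();
        int[][] documents = encode(tokenizedDocs, vocabulary);
        System.out.println(java.util.Arrays.deepToString(documents));
        System.out.println(vocabulary);

        // Assumed usage with the class documented here (not executed in this sketch):
        // lda.readCorpus(documents);
    }
}
```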

readCorpus

public void readCorpus(org.apache.commons.math.linear.RealMatrix X)
Load corpus and documents from a RealMatrix instance.

Overrides:
readCorpus in class TopicModel
Parameters:
X - a matrix in which each column is the term count vector of one document; X(i, j) is the number of occurrences of the i-th vocabulary term in the j-th document.
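
To make the column-per-document layout concrete, here is a small sketch that fills a plain double[][] with X(i, j) counts from term-index sequences. The class TermDocumentMatrixSketch and the helper toTermDocumentMatrix are hypothetical. Wrapping the array in a RealMatrix (for example with Commons Math's Array2DRowRealMatrix) is an assumption shown only in a comment, since it needs the Commons Math and jml jars on the classpath.

```java
public class TermDocumentMatrixSketch {

    /**
     * Hypothetical helper: builds a vocabularySize x nDocuments array where
     * x[i][j] is the number of occurrences of term i in document j.
     */
    static double[][] toTermDocumentMatrix(int[][] documents, int vocabularySize) {
        double[][] x = new double[vocabularySize][documents.length];
        for (int j = 0; j < documents.length; j++) {
            for (int term : documents[j]) {
                x[term][j] += 1.0; // rows are terms, columns are documents
            }
        }
        return x;
    }

    public static void main(String[] args) {
        int[][] documents = { {0, 2, 2}, {1} };
        double[][] x = toTermDocumentMatrix(documents, 3);
        System.out.println(java.util.Arrays.deepToString(x));

        // Assumed usage with the class documented here (not executed in this sketch):
        // RealMatrix X = new Array2DRowRealMatrix(x);
        // lda.readCorpus(X);
    }
}
```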