Environment for DeveLoping KDD-Applications Supported by Index-Structures

de.lmu.ifi.dbs.elki.algorithm.clustering.biclustering
Class AbstractBiclustering<V extends RealVector<V,Double>>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.logging.AbstractLoggable
      extended by de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
          extended by de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<V>
              extended by de.lmu.ifi.dbs.elki.algorithm.clustering.biclustering.AbstractBiclustering<V>
Type Parameters:
V - a subtype of RealVector; each row of the data matrix corresponds to an object of type V, and each column corresponds to an attribute of these objects
All Implemented Interfaces:
Algorithm<V>, Loggable, Parameterizable

public abstract class AbstractBiclustering<V extends RealVector<V,Double>>
extends AbstractAlgorithm<V>

Abstract class as a convenience for different biclustering approaches.

The typically required values describing submatrices are computed using the corresponding values within a database of RealVectors.

The database is expected to represent a data matrix in which each row is an entry (a RealVector) and each column is a dimension (attribute) of the RealVectors.

Author:
Arthur Zimek

Field Summary
private  int[] colIDs
          The column ids corresponding to the currently set database.
private  Database<V> database
          Keeps the currently set database.
private  Biclustering<V> result
          Keeps the result.
private  int[] rowIDs
          The row ids corresponding to the currently set database.
 
Fields inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
optionHandler
 
Fields inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debug
 
Constructor Summary
AbstractBiclustering()
           
 
Method Summary
protected  void addBiclusterToResult(Bicluster<V> bicluster)
          Adds the given Bicluster to the result of this Biclustering.
protected  void addInvertedRows(Bicluster<V> bicluster, BitSet invertedRows)
          Adds the ids of the inverted rows as specified to the given bicluster.
protected abstract  void biclustering()
          Any concrete biclustering algorithm should be implemented within this method.
protected  Bicluster<V> defineBicluster(BitSet rows, BitSet cols)
          Defines a bicluster as given by the included rows and columns.
protected  int getColDim()
          Provides the number of columns of the data matrix.
 Result<V> getResult()
          Returns the result of the algorithm.
protected  int getRowDim()
          Provides the number of rows of the data matrix.
protected  double meanOfBicluster(BitSet rows, BitSet cols)
          Provides the mean of all entries in the submatrix as specified by a set of columns and a set of rows.
protected  double meanOfCol(BitSet rows, int col)
          Provides the mean value for a column on a set of rows.
protected  double meanOfRow(int row, BitSet cols)
          Provides the mean value for a row on a set of columns.
protected  void runInTime(Database<V> database)
          Prepares the algorithm for running on a specific database.
private <P> void sort(int[] ids, int from, int to, List<P> properties, Comparator<P> comp)
          Sorts an array based on specified properties.
protected <P> void sortCols(int from, int to, List<P> properties, Comparator<P> comp)
          Sorts the columns.
protected <P> void sortRows(int from, int to, List<P> properties, Comparator<P> comp)
          Sorts the rows.
protected  double valueAt(int row, int col)
          Returns the value of the data matrix at row row and column col.
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
description, isTime, isVerbose, run, setParameters, setTime, setVerbose
 
Methods inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
addOption, checkGlobalParameterConstraints, deleteOption, description, description, getAttributeSettings, getParameters, getParameterValue, getPossibleOptions, inlineDescription, isSet, setParameters
 
Methods inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debugFine, debugFiner, debugFinest, exception, message, progress, progress, progress, verbose, verbose, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.Algorithm
getDescription
 
Methods inherited from interface de.lmu.ifi.dbs.elki.utilities.optionhandling.Parameterizable
checkGlobalParameterConstraints, getAttributeSettings, getParameters, getPossibleOptions, inlineDescription
 

Field Detail

database

private Database<V> database
Keeps the currently set database.


rowIDs

private int[] rowIDs
The row ids corresponding to the currently set database.


colIDs

private int[] colIDs
The column ids corresponding to the currently set database.


result

private Biclustering<V> result
Keeps the result. A new ResultObject is assigned when the method runInTime(Database) is called.

Constructor Detail

AbstractBiclustering

public AbstractBiclustering()
Method Detail

runInTime

protected final void runInTime(Database<V> database)
                        throws IllegalStateException
Prepares the algorithm for running on a specific database.

Assigns the database, the row ids, and the column ids, then calls biclustering().

Inheriting biclustering approaches should implement their concrete algorithm within the method biclustering().

Specified by:
runInTime in class AbstractAlgorithm<V extends RealVector<V,Double>>
Parameters:
database - the database to run the algorithm on
Throws:
IllegalStateException - if the algorithm has not been initialized properly (e.g. the setParameters(String[]) method has not been called).
See Also:
AbstractAlgorithm.runInTime(de.lmu.ifi.dbs.elki.database.Database)

biclustering

protected abstract void biclustering()
                              throws IllegalStateException
Any concrete biclustering algorithm should be implemented within this method. The database of double-valued RealVectors is encapsulated; the methods sortRows(int,int,List,Comparator), sortCols(int,int,List,Comparator), meanOfBicluster(BitSet,BitSet), meanOfRow(int,BitSet), meanOfCol(BitSet,int), and valueAt(int,int) provide the typical operations on a data matrix.

This method is supposed to be called only from the method runInTime(Database).

If a bicluster is to be appended to the result, the methods defineBicluster(BitSet,BitSet) and addBiclusterToResult(Bicluster) should be used.

Throws:
IllegalStateException - if the properties are not set properly (e.g. method is not called from method runInTime(Database), but directly)
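
The contract between runInTime(Database) and biclustering() can be illustrated with a standalone sketch. It is not ELKI code: a double[][] stands in for Database<V>, the class names BiclusteringTemplateSketch and WholeMatrixBiclustering and the methods run and addBicluster are hypothetical, and the exact point at which the IllegalStateException is raised is an assumption.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Standalone analogue of the runInTime()/biclustering() contract. Not ELKI code:
// a double[][] stands in for Database<V>, and all names are hypothetical.
abstract class BiclusteringTemplateSketch {
    private double[][] matrix; // assigned only by run()
    private final List<BitSet[]> result = new ArrayList<>();

    // Counterpart of runInTime(Database): set up the state, then delegate.
    final void run(double[][] data) {
        this.matrix = data;
        result.clear();
        biclustering();
    }

    // Counterpart of biclustering(): the concrete algorithm goes here.
    protected abstract void biclustering();

    protected int getRowDim() {
        return requireMatrix().length;
    }

    protected int getColDim() {
        double[][] m = requireMatrix();
        return m.length == 0 ? 0 : m[0].length;
    }

    // Rough counterpart of defineBicluster(...) plus addBiclusterToResult(...).
    protected void addBicluster(BitSet rows, BitSet cols) {
        requireMatrix();
        result.add(new BitSet[] { (BitSet) rows.clone(), (BitSet) cols.clone() });
    }

    List<BitSet[]> getResult() {
        return result;
    }

    // Assumption: helpers fail fast when the state was never set up by run().
    private double[][] requireMatrix() {
        if (matrix == null) {
            throw new IllegalStateException("biclustering() must be reached via run()");
        }
        return matrix;
    }
}

// Trivial concrete approach: report the whole matrix as a single bicluster.
class WholeMatrixBiclustering extends BiclusteringTemplateSketch {
    @Override
    protected void biclustering() {
        BitSet rows = new BitSet();
        rows.set(0, getRowDim());
        BitSet cols = new BitSet();
        cols.set(0, getColDim());
        addBicluster(rows, cols);
    }
}
```

In this sketch, calling biclustering() directly on a fresh WholeMatrixBiclustering fails with an IllegalStateException, whereas run(data) first assigns the matrix and then delegates to the concrete algorithm.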

defineBicluster

protected Bicluster<V> defineBicluster(BitSet rows,
                                       BitSet cols)
Defines a bicluster as given by the included rows and columns.

Parameters:
rows - the rows included in the bicluster
cols - the columns included in the bicluster
Returns:
a bicluster as given by the included rows and columns

addInvertedRows

protected void addInvertedRows(Bicluster<V> bicluster,
                               BitSet invertedRows)
Adds the ids of the inverted rows as specified to the given bicluster.

Parameters:
bicluster - the bicluster to which the ids of the inverted rows are added
invertedRows - specifies the inverted rows

addBiclusterToResult

protected void addBiclusterToResult(Bicluster<V> bicluster)
Adds the given Bicluster to the result of this Biclustering.

Parameters:
bicluster - the bicluster to add to the result

sortRows

protected <P> void sortRows(int from,
                            int to,
                            List<P> properties,
                            Comparator<P> comp)
Sorts the rows. The rows of the data matrix within the range from row from (inclusively) to row to (exclusively) are sorted according to the specified properties and Comparator.

The List of properties must be of size to - from and reflect the properties corresponding to the row ids rowIDs[from] to rowIDs[to-1].

Type Parameters:
P - the type of properties suitable to the comparator
Parameters:
from - begin of range to be sorted (inclusively)
to - end of range to be sorted (exclusively)
properties - the properties to sort the rows of the data matrix according to
comp - a Comparator suitable to the type of properties

sortCols

protected <P> void sortCols(int from,
                            int to,
                            List<P> properties,
                            Comparator<P> comp)
Sorts the columns. The columns of the data matrix within the range from column from (inclusively) to column to (exclusively) are sorted according to the specified properties and Comparator.

The List of properties must be of size to - from and reflect the properties corresponding to the column ids colIDs[from] to colIDs[to-1].

Type Parameters:
P - the type of properties suitable to the comparator
Parameters:
from - begin of range to be sorted (inclusively)
to - end of range to be sorted (exclusively)
properties - the properties to sort the columns of the data matrix according to
comp - a Comparator suitable to the type of properties

sort

private <P> void sort(int[] ids,
                      int from,
                      int to,
                      List<P> properties,
                      Comparator<P> comp)
Sorts an array based on specified properties. The array of ids is sorted within the range from index from (inclusively) to index to (exclusively) according to the specified properties and Comparator. The List of properties must be of size to - from and reflect the properties corresponding to ids ids[from] to ids[to-1].

Type Parameters:
P - the type of properties suitable to the comparator
Parameters:
ids - the ids to sort
from - begin of range to be sorted (inclusively)
to - end of range to be sorted (exclusively)
properties - the properties to sort the ids according to
comp - a Comparator suitable to the type of properties
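
One way to realize the behaviour described above is sketched here. This is an assumption about a possible implementation, not the actual ELKI code: the class name IdSortSketch is hypothetical, and the sketch sorts an auxiliary index array by the properties and then rewrites the id range in that order.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: sorts ids[from..to) by separately supplied properties.
class IdSortSketch {
    static <P> void sort(int[] ids, int from, int to, List<P> properties, Comparator<P> comp) {
        if (from < 0 || to > ids.length || from > to) {
            throw new IllegalArgumentException("invalid range");
        }
        if (properties.size() != to - from) {
            throw new IllegalArgumentException("properties must have size to - from");
        }
        // order[i] is a position within the range; properties.get(i) belongs to ids[from + i].
        Integer[] order = new Integer[to - from];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Arrays.sort on an object array is stable, so ties keep their original order.
        Arrays.sort(order, (a, b) -> comp.compare(properties.get(a), properties.get(b)));
        // Rewrite the range of ids according to the sorted positions.
        int[] sorted = new int[order.length];
        for (int i = 0; i < order.length; i++) {
            sorted[i] = ids[from + order[i]];
        }
        System.arraycopy(sorted, 0, ids, from, sorted.length);
    }
}
```

For example, sorting the ids {10, 11, 12, 13, 14} over the range from 1 to 4 with the properties [3.0, 1.0, 2.0] and natural ordering yields {10, 12, 13, 11, 14}; the ids outside the range are left untouched.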

valueAt

protected double valueAt(int row,
                         int col)
Returns the value of the data matrix at row row and column col.

Parameters:
row - the row in the data matrix according to the current order of rows (refers to database entry database.get(rowIDs[row]))
col - the column in the data matrix according to the current order of columns (refers to the attribute value of a database entry, getValue(colIDs[col]))
Returns:
the attribute value of the database entry as retrieved by database.get(rowIDs[row]).getValue(colIDs[col])
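
The indirection through rowIDs and colIDs means that reordering rows or columns only permutes the id arrays and leaves the underlying data untouched. A minimal sketch of this idea follows. It assumes a plain double[][] in place of the database and 0-based ids that index the array directly; the class name PermutedMatrixSketch and the method swapRows are hypothetical.

```java
// Hypothetical sketch of the rowIDs/colIDs indirection; not ELKI code.
class PermutedMatrixSketch {
    private final double[][] data; // stands in for the database
    private final int[] rowIDs;    // current order of rows
    private final int[] colIDs;    // current order of columns

    PermutedMatrixSketch(double[][] data) {
        this.data = data;
        this.rowIDs = identity(data.length);
        this.colIDs = identity(data.length == 0 ? 0 : data[0].length);
    }

    private static int[] identity(int n) {
        int[] ids = new int[n];
        for (int i = 0; i < n; i++) {
            ids[i] = i;
        }
        return ids;
    }

    // Looks up the value at the current (possibly reordered) position.
    double valueAt(int row, int col) {
        return data[rowIDs[row]][colIDs[col]];
    }

    // Reorders the rows by swapping ids only; the data array is not modified.
    void swapRows(int a, int b) {
        int tmp = rowIDs[a];
        rowIDs[a] = rowIDs[b];
        rowIDs[b] = tmp;
    }
}
```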

meanOfRow

protected double meanOfRow(int row,
                           BitSet cols)
Provides the mean value for a row on a set of columns. The columns are specified by a BitSet where the indices of a set bit relate to the indices in colIDs.

Parameters:
row - the row to compute the mean value w.r.t. the given set of columns (relates to database entry id rowIDs[row])
cols - the set of columns to include in the computation of the mean of the given row
Returns:
the mean value of the specified row over the specified columns
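
The following sketch shows how a BitSet of column indices could drive such a computation. It is an illustration under assumptions, not the ELKI implementation: the matrix is a plain double[][], the ids are 0-based array indices, and the class name RowMeanSketch and the extra parameters are hypothetical. An empty BitSet yields NaN here.

```java
import java.util.BitSet;

// Hypothetical sketch: mean of one row over the columns selected by a BitSet.
class RowMeanSketch {
    static double meanOfRow(double[][] matrix, int[] rowIDs, int[] colIDs, int row, BitSet cols) {
        double sum = 0.0;
        // Each set bit i selects the column currently at position i, that is colIDs[i].
        for (int i = cols.nextSetBit(0); i >= 0; i = cols.nextSetBit(i + 1)) {
            sum += matrix[rowIDs[row]][colIDs[i]];
        }
        // cardinality() is the number of selected columns; 0.0 / 0 gives NaN.
        return sum / cols.cardinality();
    }
}
```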

meanOfCol

protected double meanOfCol(BitSet rows,
                           int col)
Provides the mean value for a column on a set of rows. The rows are specified by a BitSet where the indices of a set bit relate to the indices in rowIDs.

Parameters:
rows - the set of rows to include in the computation of the mean of the given column
col - the column index to compute the mean value w.r.t. the given set of rows (relates to attribute colIDs[col] of the corresponding database entries)
Returns:
the mean value of the specified column over the specified rows

meanOfBicluster

protected double meanOfBicluster(BitSet rows,
                                 BitSet cols)
Provides the mean of all entries in the submatrix as specified by a set of columns and a set of rows.

Parameters:
rows - the set of rows to include in the computation of the mean of the submatrix
cols - the set of columns to include in the computation of the mean of the submatrix
Returns:
the mean of all entries in the submatrix
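
As a sketch, the mean of a submatrix is the sum of all selected entries divided by the number of selected rows times the number of selected columns. The code below assumes a plain double[][] with bit indices used directly as array indices (no id indirection); the class name SubmatrixMeanSketch is hypothetical and this is not the ELKI implementation.

```java
import java.util.BitSet;

// Hypothetical sketch: mean over the submatrix selected by two BitSets.
class SubmatrixMeanSketch {
    static double meanOfBicluster(double[][] matrix, BitSet rows, BitSet cols) {
        double sum = 0.0;
        for (int r = rows.nextSetBit(0); r >= 0; r = rows.nextSetBit(r + 1)) {
            for (int c = cols.nextSetBit(0); c >= 0; c = cols.nextSetBit(c + 1)) {
                sum += matrix[r][c];
            }
        }
        // Number of entries in the submatrix: |rows| * |cols|.
        return sum / ((double) rows.cardinality() * cols.cardinality());
    }
}
```

For the 3 x 3 matrix with rows {1, 2, 3}, {4, 5, 6}, {7, 8, 9}, selecting rows 0 and 2 and columns 1 and 2 picks the entries 2, 3, 8, and 9, so the mean is 22 / 4 = 5.5.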

getResult

public Result<V> getResult()
Description copied from interface: Algorithm
Returns the result of the algorithm.

Returns:
the result of the algorithm
See Also:
Algorithm.getResult()

getRowDim

protected int getRowDim()
Provides the number of rows of the data matrix.

Returns:
the number of rows of the data matrix

getColDim

protected int getColDim()
Provides the number of columns of the data matrix.

Returns:
the number of columns of the data matrix

Release 0.1 (2008-07-10_1838)