Specify random number seed.
- Version:
- $Revision: 1.13 $
- Author:
- Mark Hall (mhall@cs.waikato.ac.nz), Eibe Frank (eibe@cs.waikato.ac.nz)
- See Also:
Clusterer
,
OptionHandler
,
Serialized Form
Method Summary |
void |
buildClusterer(Instances data)
Generates a clusterer. |
int |
clusterInstance(Instance instance)
Classifies a given instance. |
private int |
clusterProcessedInstance(Instance instance)
clusters an instance that has been through the filters |
private double |
difference(int index,
double val1,
double val2)
Computes the difference between two given attribute
values. |
private double |
distance(Instance first,
Instance second)
Calculates the distance between two instances |
int |
getNumClusters()
gets the number of clusters to generate |
java.lang.String[] |
getOptions()
Gets the current settings of SimpleKMeans |
int |
getSeed()
Get the random number seed |
java.lang.String |
globalInfo()
Returns a string describing this clusterer |
java.util.Enumeration |
listOptions()
Returns an enumeration describing the available options.. |
static void |
main(java.lang.String[] argv)
Main method for testing this class. |
private double |
norm(double x,
int i)
Normalizes a given value of a numeric attribute. |
int |
numberOfClusters()
Returns the number of clusters. |
java.lang.String |
numClustersTipText()
Returns the tip text for this property |
java.lang.String |
seedTipText()
Returns the tip text for this property |
void |
setNumClusters(int n)
set the number of clusters to generate |
void |
setOptions(java.lang.String[] options)
Parses a given list of options. |
void |
setSeed(int s)
Set the random number seed |
java.lang.String |
toString()
return a string describing this clusterer |
private void |
updateMinMax(Instance instance)
Updates the minimum and maximum values for all the attributes
based on a new instance. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
m_ReplaceMissingFilter
private ReplaceMissingValues m_ReplaceMissingFilter
- replace missing values in training instances
m_NumClusters
private int m_NumClusters
- number of clusters to generate
m_ClusterCentroids
private Instances m_ClusterCentroids
- holds the cluster centroids
m_ClusterStdDevs
private Instances m_ClusterStdDevs
- Holds the standard deviations of attributes in each cluster
m_Seed
private int m_Seed
- random seed
m_Min
private double[] m_Min
- attribute min values
m_Max
private double[] m_Max
- attribute max values
m_Iterations
private int m_Iterations
- Keep track of the number of iterations completed before convergence
SimpleKMeans
public SimpleKMeans()
globalInfo
public java.lang.String globalInfo()
- Returns a string describing this clusterer
- Returns:
- a description of the evaluator suitable for
displaying in the explorer/experimenter gui
buildClusterer
public void buildClusterer(Instances data)
throws java.lang.Exception
- Generates a clusterer. Has to initialize all fields of the clusterer
that are not being set via options.
- Specified by:
buildClusterer
in class Clusterer
- Parameters:
data
- set of instances serving as training data
- Throws:
java.lang.Exception
- if the clusterer has not been
generated successfully
clusterProcessedInstance
private int clusterProcessedInstance(Instance instance)
- clusters an instance that has been through the filters
- Parameters:
instance
- the instance to assign a cluster to
- Returns:
- a cluster number
clusterInstance
public int clusterInstance(Instance instance)
throws java.lang.Exception
- Classifies a given instance.
- Overrides:
clusterInstance
in class Clusterer
- Parameters:
instance
- the instance to be assigned to a cluster
- Returns:
- the number of the assigned cluster as an interger
if the class is enumerated, otherwise the predicted value
- Throws:
java.lang.Exception
- if instance could not be classified
successfully
distance
private double distance(Instance first,
Instance second)
- Calculates the distance between two instances
- Returns:
- the distance between the two given instances, between 0 and 1
difference
private double difference(int index,
double val1,
double val2)
- Computes the difference between two given attribute
values.
norm
private double norm(double x,
int i)
- Normalizes a given value of a numeric attribute.
- Parameters:
x
- the value to be normalizedi
- the attribute's index
updateMinMax
private void updateMinMax(Instance instance)
- Updates the minimum and maximum values for all the attributes
based on a new instance.
- Parameters:
instance
- the new instance
numberOfClusters
public int numberOfClusters()
throws java.lang.Exception
- Returns the number of clusters.
- Specified by:
numberOfClusters
in class Clusterer
- Returns:
- the number of clusters generated for a training dataset.
- Throws:
java.lang.Exception
- if number of clusters could not be returned
successfully
listOptions
public java.util.Enumeration listOptions()
- Returns an enumeration describing the available options..
Valid options are:
-N
Specify the number of clusters to generate. If omitted,
EM will use cross validation to select the number of clusters
automatically.
-S
Specify random number seed.
- Specified by:
listOptions
in interface OptionHandler
- Returns:
- an enumeration of all the available options.
numClustersTipText
public java.lang.String numClustersTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
setNumClusters
public void setNumClusters(int n)
- set the number of clusters to generate
- Parameters:
n
- the number of clusters to generate
getNumClusters
public int getNumClusters()
- gets the number of clusters to generate
- Returns:
- the number of clusters to generate
seedTipText
public java.lang.String seedTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
setSeed
public void setSeed(int s)
- Set the random number seed
- Parameters:
s
- the seed
getSeed
public int getSeed()
- Get the random number seed
- Returns:
- the seed
setOptions
public void setOptions(java.lang.String[] options)
throws java.lang.Exception
- Parses a given list of options.
- Specified by:
setOptions
in interface OptionHandler
- Parameters:
options
- the list of options as an array of strings
- Throws:
java.lang.Exception
- if an option is not supported
getOptions
public java.lang.String[] getOptions()
- Gets the current settings of SimpleKMeans
- Specified by:
getOptions
in interface OptionHandler
- Returns:
- an array of strings suitable for passing to setOptions()
toString
public java.lang.String toString()
- return a string describing this clusterer
- Returns:
- a description of the clusterer as a string
main
public static void main(java.lang.String[] argv)
- Main method for testing this class.
- Parameters:
argv
- should contain the following arguments:
-t training file [-N number of clusters]