weka.datagenerators
Class Generator

java.lang.Object
  extended by weka.datagenerators.Generator
All Implemented Interfaces:
java.io.Serializable
Direct Known Subclasses:
RDG1

public abstract class Generator
extends java.lang.Object
implements java.io.Serializable

Abstract class for data generators.

General options are:

-r string
Name of the relation of the generated dataset.
(default = name built using name of used generator and options)

-a num
Number of attributes. (default = 10)

-c num
Number of classes. (default = 2)

-n num
Number of examples. (default = 100)

-o filename
Writes the generated dataset to the given file in ARFF format. (default = stdout)
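
As a rough illustration of how the defaults listed above behave, the following sketch parses the five generic flags into a plain holder object. The class GenericOptionsSketch and its fields are hypothetical and not part of Weka. Rejecting unknown flags is a simplification made here, since a real generator would also accept its own specific options.

```java
/** Hypothetical holder for the generic options; not part of Weka. */
final class GenericOptionsSketch {
    String relationName = null; // -r; null means "let the generator build a name"
    int numAttributes = 10;     // -a
    int numClasses = 2;         // -c
    int numExamples = 100;      // -n
    String outputFile = null;   // -o; null means "write to stdout"

    /** Parses flag/value pairs; unknown flags are rejected in this sketch. */
    static GenericOptionsSketch parse(String[] args) {
        GenericOptionsSketch o = new GenericOptionsSketch();
        for (int i = 0; i < args.length; i += 2) {
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for " + args[i]);
            }
            String value = args[i + 1];
            switch (args[i]) {
                case "-r": o.relationName = value; break;
                case "-a": o.numAttributes = Integer.parseInt(value); break;
                case "-c": o.numClasses = Integer.parseInt(value); break;
                case "-n": o.numExamples = Integer.parseInt(value); break;
                case "-o": o.outputFile = value; break;
                default: throw new IllegalArgumentException("Unknown option: " + args[i]);
            }
        }
        return o;
    }

    public static void main(String[] args) {
        GenericOptionsSketch o = parse(new String[] {"-a", "5", "-n", "250"});
        // Prints "5 2 250": -c was not given, so its default of 2 applies.
        System.out.println(o.numAttributes + " " + o.numClasses + " " + o.numExamples);
    }
}
```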

Example usage as the main method of a data generator called RandomGenerator:

 public static void main(String[] args) {
   try {
     // makeData is a static method of Generator; pass the command-line options through
     Generator.makeData(new RandomGenerator(), args);
   } catch (Exception e) {
     System.err.println(e.getMessage());
   }
 }
 


Version:
$Revision: 1.1 $
Author:
Gabi Schmidberger (gabi@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
private  boolean m_Debug
           
private  Instances m_Format
           
private  int m_NumAttributes
           
private  int m_NumClasses
           
private  int m_NumExamples
           
private  int m_NumExamplesAct
           
private  java.io.PrintWriter m_Output
           
private  java.lang.String m_RelationName
           
 
Constructor Summary
Generator()
           
 
Method Summary
(package private) abstract  Instances defineDataFormat()
          Initializes the format for the dataset produced.
(package private) abstract  Instance generateExample()
          Generates one example of the dataset.
(package private) abstract  Instances generateExamples()
          Generates all examples of the dataset.
(package private) abstract  java.lang.String generateFinished()
          Generates a comment string that documents the data generator.
 boolean getDebug()
          Gets the debug flag.
protected  Instances getFormat()
          Gets the format of the dataset that is to be generated.
private  java.lang.String[] getGenericOptions()
          Gets the current generic settings of the data generator.
 int getNumAttributes()
          Gets the number of attributes that should be produced.
 int getNumClasses()
          Gets the number of classes the dataset should have.
 int getNumExamples()
          Gets the number of examples, given by option.
 int getNumExamplesAct()
          Gets the number of examples the dataset should have.
 java.io.PrintWriter getOutput()
          Gets the print writer.
 java.lang.String getRelationName()
          Gets the relation name the dataset should have.
(package private) abstract  boolean getSingleModeFlag()
          Returns whether single mode is set for the given data generator; the mode depends on the option settings and/or the generator type.
private static java.lang.String listGenericOptions(Generator generator)
          Method for listing generic options.
private  java.lang.String listSpecificOptions(Generator generator)
          Makes a string with the options of the specific data generator.
static void makeData(Generator generator, java.lang.String[] options)
          Calls the data generator.
 void setDebug(boolean debug)
          Sets the debug flag.
protected  void setFormat(Instances newFormat)
          Sets the format of the dataset that is to be generated.
 void setNumAttributes(int numAttributes)
          Sets the number of attributes the dataset should have.
 void setNumClasses(int numClasses)
          Sets the number of classes the dataset should have.
 void setNumExamples(int numExamples)
          Sets the number of examples, given by option.
 void setNumExamplesAct(int numExamplesAct)
          Sets the number of examples the dataset should have.
private static void setOptions(Generator generator, java.lang.String[] options)
          Sets the generic options and specific options.
 void setOutput(java.io.PrintWriter newOutput)
          Sets the print writer.
 void setRelationName(java.lang.String relationName)
          Sets the relation name the dataset should have.
protected  java.lang.String toStringFormat()
          Returns a string representing the dataset in the instance queue.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m_Debug

private boolean m_Debug

m_Format

private Instances m_Format

m_RelationName

private java.lang.String m_RelationName

m_NumAttributes

private int m_NumAttributes

m_NumClasses

private int m_NumClasses

m_NumExamples

private int m_NumExamples

m_NumExamplesAct

private int m_NumExamplesAct

m_Output

private java.io.PrintWriter m_Output
Constructor Detail

Generator

public Generator()
Method Detail

defineDataFormat

abstract Instances defineDataFormat()
                             throws java.lang.Exception
Initializes the format for the dataset produced. Must be called before the generateExample or generateExamples methods are used.

Returns:
the format for the dataset
Throws:
java.lang.Exception - if the generating of the format failed

generateExample

abstract Instance generateExample()
                           throws java.lang.Exception
Generates one example of the dataset.

Returns:
the generated example
Throws:
java.lang.Exception - if the format of the dataset is not yet defined
java.lang.Exception - if the generator only works with generateExamples, that is, in non-single mode
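
The ordering requirement between defineDataFormat and generateExample can be illustrated with a small stand-in. This is only a sketch: UniformSketch is a hypothetical class, a String[] of attribute names replaces Instances, and a double[] replaces Instance, so none of it depends on Weka.

```java
import java.util.Random;

final class FormatOrderSketch {
    /** Stand-in generator; a String[] of attribute names replaces Instances. */
    static final class UniformSketch {
        private final int numAttributes;
        private final Random random = new Random(42);
        private String[] format; // null until defineDataFormat() has run

        UniformSketch(int numAttributes) {
            this.numAttributes = numAttributes;
        }

        /** Builds and stores the format; must run before generateExample(). */
        String[] defineDataFormat() {
            format = new String[numAttributes];
            for (int i = 0; i < numAttributes; i++) {
                format[i] = "a" + i;
            }
            return format;
        }

        /** Produces one row of random values, or fails if no format exists yet. */
        double[] generateExample() throws Exception {
            if (format == null) {
                throw new Exception("Dataset format not defined.");
            }
            double[] values = new double[format.length];
            for (int i = 0; i < values.length; i++) {
                values[i] = random.nextDouble();
            }
            return values;
        }
    }

    public static void main(String[] args) throws Exception {
        UniformSketch generator = new UniformSketch(3);
        try {
            generator.generateExample(); // format is still undefined here
        } catch (Exception e) {
            System.out.println("rejected: " + e.getMessage());
        }
        generator.defineDataFormat();
        System.out.println(generator.generateExample().length + " values");
    }
}
```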

generateExamples

abstract Instances generateExamples()
                             throws java.lang.Exception
Generates all examples of the dataset.

Returns:
the generated dataset
Throws:
java.lang.Exception - if the format of the dataset is not yet defined
java.lang.Exception - if the generator only works with generateExample, which means in single mode

generateFinished

abstract java.lang.String generateFinished()
                                    throws java.lang.Exception
Generates a comment string that documents the data generator. By default this string is appended to the end of the produced output, which is in ARFF format.

Returns:
a string containing information about the generated rules
Throws:
java.lang.Exception - if generating the documentation fails
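
In ARFF, comment lines start with '%', so a trailing documentation block could look like the output of the sketch below. Whether the generator or the caller adds the '%' prefix is not stated here; the helper asArffComment and the sample rule text are assumptions made for illustration.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

final class ArffTrailerSketch {
    /** Prefixes every line of the text with the ARFF comment marker '%'. */
    static String asArffComment(String text) {
        StringBuilder sb = new StringBuilder();
        for (String line : text.split("\n")) {
            sb.append("% ").append(line).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        // A tiny hand-written ARFF body stands in for the generated dataset.
        out.print("@relation demo\n@attribute a0 numeric\n@data\n0.5\n");
        // The documentation string goes after the data, as comment lines.
        out.print(asArffComment("rule 1: a0 > 0.3\nrule 2: a0 <= 0.3"));
        out.flush();
        System.out.print(buffer);
    }
}
```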

getSingleModeFlag

abstract boolean getSingleModeFlag()
                            throws java.lang.Exception
Returns whether single mode is set for the given data generator. The mode depends on the option settings and/or the generator type.

Returns:
single mode flag
Throws:
java.lang.Exception - if mode is not set yet
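
One plausible way for a driver to use the single mode flag is sketched below: in single mode it calls generateExample once per example, otherwise it calls generateExamples once. This is an assumption about how such a driver might look, not the actual makeData implementation. CountingSource is a hypothetical stand-in whose rows are plain strings.

```java
import java.util.ArrayList;
import java.util.List;

final class SingleModeSketch {
    /** Hypothetical stand-in for the generator contract; rows are plain strings. */
    static final class CountingSource {
        private final boolean singleMode;
        private final int numExamplesAct;
        private int next = 0;

        CountingSource(boolean singleMode, int numExamplesAct) {
            this.singleMode = singleMode;
            this.numExamplesAct = numExamplesAct;
        }

        boolean getSingleModeFlag() { return singleMode; }

        int getNumExamplesAct() { return numExamplesAct; }

        /** One row per call; only allowed in single mode. */
        String generateExample() {
            if (!singleMode) {
                throw new IllegalStateException("only generateExamples is supported");
            }
            return "row" + next++;
        }

        /** All rows at once; only allowed in non-single mode. */
        List<String> generateExamples() {
            if (singleMode) {
                throw new IllegalStateException("only generateExample is supported");
            }
            List<String> rows = new ArrayList<>();
            for (int i = 0; i < numExamplesAct; i++) {
                rows.add("row" + i);
            }
            return rows;
        }
    }

    /** Picks the generation method that matches the source's mode. */
    static List<String> collect(CountingSource source) {
        if (!source.getSingleModeFlag()) {
            return source.generateExamples();
        }
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < source.getNumExamplesAct(); i++) {
            rows.add(source.generateExample());
        }
        return rows;
    }

    public static void main(String[] args) {
        // Both calls print [row0, row1, row2].
        System.out.println(collect(new CountingSource(true, 3)));
        System.out.println(collect(new CountingSource(false, 3)));
    }
}
```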

setDebug

public void setDebug(boolean debug)
Sets the debug flag.

Parameters:
debug - the new debug flag

getDebug

public boolean getDebug()
Gets the debug flag.

Returns:
the debug flag

setRelationName

public void setRelationName(java.lang.String relationName)
Sets the relation name the dataset should have.

Parameters:
relationName - the new relation name

getRelationName

public java.lang.String getRelationName()
Gets the relation name the dataset should have.

Returns:
the relation name the dataset should have

setNumClasses

public void setNumClasses(int numClasses)
Sets the number of classes the dataset should have.

Parameters:
numClasses - the new number of classes

getNumClasses

public int getNumClasses()
Gets the number of classes the dataset should have.

Returns:
the number of classes the dataset should have

setNumExamples

public void setNumExamples(int numExamples)
Sets the number of examples, given by option.

Parameters:
numExamples - the new number of examples

getNumExamples

public int getNumExamples()
Gets the number of examples, given by option.

Returns:
the number of examples, given by option

setNumAttributes

public void setNumAttributes(int numAttributes)
Sets the number of attributes the dataset should have.

Parameters:
numAttributes - the new number of attributes

getNumAttributes

public int getNumAttributes()
Gets the number of attributes that should be produced.

Returns:
the number of attributes that should be produced

setNumExamplesAct

public void setNumExamplesAct(int numExamplesAct)
Sets the number of examples the dataset should have.

Parameters:
numExamplesAct - the new number of examples

getNumExamplesAct

public int getNumExamplesAct()
Gets the number of examples the dataset should have.

Returns:
the number of examples the dataset should have

setOutput

public void setOutput(java.io.PrintWriter newOutput)
Sets the print writer.

Parameters:
newOutput - the new print writer

getOutput

public java.io.PrintWriter getOutput()
Gets the print writer.

Returns:
print writer object

setFormat

protected void setFormat(Instances newFormat)
Sets the format of the dataset that is to be generated.

Parameters:
newFormat - the new format of the dataset

getFormat

protected Instances getFormat()
Gets the format of the dataset that is to be generated.

Returns:
the format of the dataset

toStringFormat

protected java.lang.String toStringFormat()
Returns a string representing the dataset in the instance queue.

Returns:
the string representing the output data format

makeData

public static void makeData(Generator generator,
                            java.lang.String[] options)
                     throws java.lang.Exception
Calls the data generator.

Parameters:
generator - the data generator to call
options - options of the data generator
Throws:
java.lang.Exception - if there was an error in the option list

listSpecificOptions

private java.lang.String listSpecificOptions(Generator generator)
Makes a string with the options of the specific data generator.

Parameters:
generator - the data generator that is used
Returns:
string with the options of the data generator used

setOptions

private static void setOptions(Generator generator,
                               java.lang.String[] options)
                        throws java.lang.Exception
Sets the generic options and specific options.

Parameters:
generator - the data generator used
options - the generic options and the specific options
Throws:
java.lang.Exception - if help is requested or an option is invalid

listGenericOptions

private static java.lang.String listGenericOptions(Generator generator)
Method for listing generic options.

Parameters:
generator - the data generator
Returns:
string with the generic data generator options

getGenericOptions

private java.lang.String[] getGenericOptions()
Gets the current generic settings of the data generator.

Returns:
an array of strings suitable for passing to setOptions
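
The array returned here presumably alternates flags and values in the same form that setOptions accepts. A minimal sketch of building such an array from the generic settings follows. The method toOptions is hypothetical, and leaving out -r when no relation name is set is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class GenericOptionsArraySketch {
    /** Builds a flag/value array from the generic settings. */
    static String[] toOptions(String relationName, int numAttributes,
                              int numClasses, int numExamples) {
        List<String> opts = new ArrayList<>();
        if (relationName != null) { // assumption: omit -r when no name is set
            opts.add("-r");
            opts.add(relationName);
        }
        opts.add("-a");
        opts.add(Integer.toString(numAttributes));
        opts.add("-c");
        opts.add(Integer.toString(numClasses));
        opts.add("-n");
        opts.add(Integer.toString(numExamples));
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Prints "-r demo -a 10 -c 2 -n 100".
        System.out.println(String.join(" ", toOptions("demo", 10, 2, 100)));
    }
}
```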