java.lang.Object
  weka.datagenerators.Generator
    weka.datagenerators.RDG1
Class to generate data randomly by producing a decision list. The decision list consists of rules. Instances are generated randomly, one by one. If the decision list fails to classify the current instance, a new rule based on this instance is generated and added to the decision list.
The option -V switches on voting, which means that at the end of the generation all instances are reclassified to the class value that is supported by the most rules.
This data generator can generate 'boolean' attributes (= nominal with the values {true, false}) and numeric attributes. The rules can be 'A' or 'NOT A' for boolean values and 'B < random_value' or 'B >= random_value' for numeric values.
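The generation loop described above can be sketched in plain Java. This is a simplified illustration, not the Weka source: the names `DecisionListSketch`, `AttributeTest`, `Rule`, `classify`, and `generate` are hypothetical, only boolean attributes are modeled, and the way a new rule is built (tests that the failing instance satisfies, plus a randomly drawn class) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of decision-list data generation; not the Weka implementation. */
final class DecisionListSketch {

    /** A test on one boolean attribute: 'A' (wanted == true) or 'NOT A' (wanted == false). */
    static final class AttributeTest {
        final int attribute;
        final boolean wanted;

        AttributeTest(int attribute, boolean wanted) {
            this.attribute = attribute;
            this.wanted = wanted;
        }

        boolean passes(boolean[] instance) {
            return instance[attribute] == wanted;
        }
    }

    /** A rule fires when all of its tests pass; it then assigns its class value. */
    static final class Rule {
        final List<AttributeTest> tests;
        final int classValue;

        Rule(List<AttributeTest> tests, int classValue) {
            this.tests = tests;
            this.classValue = classValue;
        }

        boolean fires(boolean[] instance) {
            for (AttributeTest test : tests) {
                if (!test.passes(instance)) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Returns the class of the first rule that fires, or -1 if no rule fires. */
    static int classify(List<Rule> decisionList, boolean[] instance) {
        for (Rule rule : decisionList) {
            if (rule.fires(instance)) {
                return rule.classValue;
            }
        }
        return -1;
    }

    /** Labels random instances one by one, adding a rule whenever the list fails. */
    static int[] generate(int numExamples, int numAttributes, int numClasses,
                          int ruleSize, long seed, List<Rule> decisionList) {
        Random random = new Random(seed);
        int[] labels = new int[numExamples];
        for (int i = 0; i < numExamples; i++) {
            boolean[] instance = new boolean[numAttributes];
            for (int a = 0; a < numAttributes; a++) {
                instance[a] = random.nextBoolean();
            }
            int label = classify(decisionList, instance);
            if (label < 0) {
                // Assumption: the new rule uses tests that the current instance
                // satisfies (an attribute may be drawn twice) and a random class.
                List<AttributeTest> tests = new ArrayList<>();
                for (int t = 0; t < ruleSize; t++) {
                    int a = random.nextInt(numAttributes);
                    tests.add(new AttributeTest(a, instance[a]));
                }
                label = random.nextInt(numClasses);
                decisionList.add(new Rule(tests, label));
            }
            labels[i] = label;
        }
        return labels;
    }
}
```

Because each new rule is built from the instance that the list could not classify, that instance is always covered once the rule has been appended. Using a fixed seed makes the sketch reproducible, which mirrors the purpose of the -S option.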
Valid options are:
-R num
The maximum number of attributes chosen to form a rule (default 10).
-M num
The minimum number of attributes chosen to form a rule (default 1).
-I num
The number of irrelevant attributes (default 0).
-N num
The number of numeric attributes (default 0).
-S seed
Random number seed for random function used (default 1).
-V
Flag to use voting.
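To make the defaults concrete, the following sketch collects the six options listed above in a small holder class. `Rdg1OptionsSketch` and its `parse` method are hypothetical names introduced for illustration; the real class is configured through `setOptions(java.lang.String[])`, and this sketch rejects any flag that is not in the list above.

```java
/** Hypothetical holder for the six options listed above; not part of Weka. */
final class Rdg1OptionsSketch {
    int maxRuleSize = 10;  // -R, default 10
    int minRuleSize = 1;   // -M, default 1
    int numIrrelevant = 0; // -I, default 0
    int numNumeric = 0;    // -N, default 0
    int seed = 1;          // -S, default 1
    boolean vote = false;  // -V, off unless the flag is present

    /** Parses the listed flags; anything else is reported as unsupported. */
    static Rdg1OptionsSketch parse(String[] args) {
        Rdg1OptionsSketch o = new Rdg1OptionsSketch();
        for (int i = 0; i < args.length; i++) {
            String flag = args[i];
            if (flag.equals("-V")) {
                o.vote = true; // the only flag without a value
                continue;
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for " + flag);
            }
            int value = Integer.parseInt(args[++i]);
            switch (flag) {
                case "-R": o.maxRuleSize = value; break;
                case "-M": o.minRuleSize = value; break;
                case "-I": o.numIrrelevant = value; break;
                case "-N": o.numNumeric = value; break;
                case "-S": o.seed = value; break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + flag);
            }
        }
        return o;
    }
}
```

For example, `Rdg1OptionsSketch.parse(new String[] {"-M", "2", "-S", "2"})` leaves the maximum rule size at its default of 10 while setting the minimum rule size and the seed to 2.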
The following is an example of a generated dataset:
%
% weka.datagenerators.RDG1 -r expl -a 2 -c 3 -n 4 -N 1 -I 0 -M 2 -R 10 -S 2
%
@relation expl
@attribute a0 {false,true}
@attribute a1 numeric
@attribute class {c0,c1,c2}
@data
true,0.496823,c0
false,0.743158,c1
false,0.408285,c1
false,0.993687,c2
%
% Number of attributes chosen as irrelevant = 0
%
% DECISIONLIST (number of rules = 3):
% RULE 0: c0 := a1 < 0.986, a0
% RULE 1: c1 := a1 < 0.95, not(a0)
% RULE 2: c2 := not(a0), a1 >= 0.562
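The class labels in the example dataset above can be checked by hand against the printed decision list. The sketch below hard-codes the three rules and applies them in order, returning the class of the first rule that fires. The class name `ExampleDecisionList` is hypothetical, and first-match evaluation is an assumption based on the term "decision list" (voting is not used in the example command line).

```java
/** Hypothetical check of the example decision list; not part of Weka. */
final class ExampleDecisionList {

    /** Applies the three example rules in order; returns null if none fires. */
    static String classify(boolean a0, double a1) {
        if (a1 < 0.986 && a0) {
            return "c0"; // RULE 0: c0 := a1 < 0.986, a0
        }
        if (a1 < 0.95 && !a0) {
            return "c1"; // RULE 1: c1 := a1 < 0.95, not(a0)
        }
        if (!a0 && a1 >= 0.562) {
            return "c2"; // RULE 2: c2 := not(a0), a1 >= 0.562
        }
        return null;
    }

    public static void main(String[] args) {
        // The four data rows of the example dataset, without their class column.
        boolean[] a0 = {true, false, false, false};
        double[] a1 = {0.496823, 0.743158, 0.408285, 0.993687};
        for (int i = 0; i < a0.length; i++) {
            System.out.println(a0[i] + "," + a1[i] + "," + classify(a0[i], a1[i]));
        }
    }
}
```

Under this reading, the four rows are assigned c0, c1, c1, and c2, which matches the class column of the example. The last row shows why rule order matters: it fails RULE 1 only because 0.993687 is not below 0.95, so it falls through to RULE 2.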
Nested Class Summary
private class | RDG1.RuleList
Field Summary
(package private) boolean[] | m_AttList_Irr
private Instances | m_DatasetFormat
private int | m_Debug
private FastVector | m_DecisionList
private int | m_MaxRuleSize
private int | m_MinRuleSize
private int | m_NumIrrelevant
private int | m_NumNumeric
private java.util.Random | m_Random
private int | m_Seed
private boolean | m_VoteFlag
Fields inherited from class weka.datagenerators.Generator
Constructor Summary
RDG1()
Method Summary
private boolean | classifyExample(Instance example) | Tries to classify an example.
Instances | defineDataFormat() | Initializes the format for the dataset produced.
private Instances | defineDataset(java.util.Random random) | Returns a dataset header.
private boolean[] | defineIrrelevant(java.util.Random random) | Randomly marks attributes as irrelevant.
private int[] | defineNumeric(java.util.Random random) | Randomly chooses the attributes that get the numeric data type.
Instance | generateExample() | Generates an example of the dataset.
private Instance | generateExample(java.util.Random random, Instances format) | Generates an example with its class value set to missing and binds it to the dataset.
Instances | generateExamples() | Generates all examples of the dataset.
Instances | generateExamples(int num, java.util.Random random, Instances format) | Generates all examples of the dataset.
java.lang.String | generateFinished() | Compiles documentation about the data generation.
private FastVector | generateTestList(java.util.Random random, Instance example) | Generates a new rule for the decision list and classifies the new example.
boolean[] | getAttList_Irr() | Gets the array that defines which of the attributes are seen to be irrelevant.
Instances | getDatasetFormat() | Gets the dataset format.
int | getMaxRuleSize() | Gets the maximum number of tests in rules.
int | getMinRuleSize() | Gets the minimum number of tests in rules.
int | getNumIrrelevant() | Gets the number of irrelevant attributes.
int | getNumNumeric() | Gets the number of numeric attributes.
java.lang.String[] | getOptions() | Gets the current settings of the data generator RDG1.
java.util.Random | getRandom() | Gets the random generator.
int | getSeed() | Gets the random number seed.
boolean | getSingleModeFlag() | Gets the single mode flag.
boolean | getVoteFlag() | Gets the vote flag.
java.lang.String | globalInfo() | Returns a string describing this data generator.
java.util.Enumeration | listOptions() | Returns an enumeration describing the available options.
static void | main(java.lang.String[] argv) | Main method for testing this class.
void | setAttList_Irr(boolean[] newAttList_Irr) | Sets the array that defines which of the attributes are seen to be irrelevant.
void | setDatasetFormat(Instances newDatasetFormat) | Sets the dataset format.
void | setMaxRuleSize(int newMaxRuleSize) | Sets the maximum number of tests in rules.
void | setMinRuleSize(int newMinRuleSize) | Sets the minimum number of tests in rules.
void | setNumIrrelevant(int newNumIrrelevant) | Sets the number of irrelevant attributes.
void | setNumNumeric(int newNumNumeric) | Sets the number of numeric attributes.
void | setOptions(java.lang.String[] options) | Parses a list of options for this object.
void | setRandom(java.util.Random newRandom) | Sets the random generator.
void | setSeed(int newSeed) | Sets the random number seed.
void | setVoteFlag(boolean newVoteFlag) | Sets the vote flag.
private Instance | updateDecisionList(java.util.Random random, Instance example) | Generates a new rule for the decision list.
private Instances | voteDataset(Instances dataset) | Resets the class values of all instances using voting.
private Instance | votedReclassifyExample(Instance example) | Classifies the example by maximum vote.
Methods inherited from class weka.datagenerators.Generator
getDebug, getFormat, getNumAttributes, getNumClasses, getNumExamples, getNumExamplesAct, getOutput, getRelationName, makeData, setDebug, setFormat, setNumAttributes, setNumClasses, setNumExamples, setNumExamplesAct, setOutput, setRelationName, toStringFormat
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
private int m_MaxRuleSize
private int m_MinRuleSize
private int m_NumIrrelevant
private int m_NumNumeric
private int m_Seed
private boolean m_VoteFlag
private Instances m_DatasetFormat
private java.util.Random m_Random
private FastVector m_DecisionList
boolean[] m_AttList_Irr
private int m_Debug
Constructor Detail
public RDG1()
Method Detail
public java.lang.String globalInfo()
public java.util.Enumeration listOptions()
listOptions in interface OptionHandler
public void setOptions(java.lang.String[] options) throws java.lang.Exception
For a list of valid options, see the class description.
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
getOptions in interface OptionHandler
public java.util.Random getRandom()
public void setRandom(java.util.Random newRandom)
Parameters:
newRandom - the random generator
public int getMaxRuleSize()
public void setMaxRuleSize(int newMaxRuleSize)
Parameters:
newMaxRuleSize - the new maximum number of tests allowed in rules
public int getMinRuleSize()
public void setMinRuleSize(int newMinRuleSize)
Parameters:
newMinRuleSize - the new minimum number of tests in rules
public int getNumIrrelevant()
public void setNumIrrelevant(int newNumIrrelevant)
public int getNumNumeric()
public void setNumNumeric(int newNumNumeric)
public boolean getVoteFlag()
public void setVoteFlag(boolean newVoteFlag)
Parameters:
newVoteFlag - the new setting of the vote flag
public boolean getSingleModeFlag()
getSingleModeFlag in class Generator
public int getSeed()
public void setSeed(int newSeed)
Parameters:
newSeed - the new random number seed
public Instances getDatasetFormat()
public void setDatasetFormat(Instances newDatasetFormat)
Parameters:
newDatasetFormat - the new dataset format
public boolean[] getAttList_Irr()
public void setAttList_Irr(boolean[] newAttList_Irr)
Parameters:
newAttList_Irr - array that defines the irrelevant attributes
public Instances defineDataFormat() throws java.lang.Exception
defineDataFormat in class Generator
Throws:
java.lang.Exception - data format could not be defined
public Instance generateExample() throws java.lang.Exception
generateExample in class Generator
Throws:
java.lang.Exception - if format not defined or generating
public Instances generateExamples() throws java.lang.Exception
generateExamples in class Generator
Throws:
java.lang.Exception - if format not defined or generating
public Instances generateExamples(int num, java.util.Random random, Instances format) throws java.lang.Exception
Throws:
java.lang.Exception - if format not defined or generating
private Instance updateDecisionList(java.util.Random random, Instance example) throws java.lang.Exception
Parameters:
random - random number generator
example - example used to update decision list
Throws:
java.lang.Exception
private FastVector generateTestList(java.util.Random random, Instance example) throws java.lang.Exception
Parameters:
random - random number generator
example -
Throws:
java.lang.Exception
private Instance generateExample(java.util.Random random, Instances format) throws java.lang.Exception
Parameters:
random - random number generator
Throws:
java.lang.Exception
private boolean classifyExample(Instance example) throws java.lang.Exception
Parameters:
example -
Throws:
java.lang.Exception
private Instance votedReclassifyExample(Instance example) throws java.lang.Exception
Parameters:
example - example to be reclassified
Throws:
java.lang.Exception
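Reclassification by vote, as described for the -V option, can be illustrated with a small stand-alone sketch. Every rule that fires on an instance casts one vote for its class, and the class with the most votes wins. The names `VotingSketch`, `Rule`, and `vote` are hypothetical, and two details are assumptions that the documentation does not state: ties go to the lowest class index, and an instance on which no rule fires keeps its current class.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of reclassification by maximum vote; not the Weka source. */
final class VotingSketch {

    /** A rule pairs a condition on the attribute values with a class value. */
    static final class Rule {
        final Predicate<double[]> condition;
        final int classValue;

        Rule(Predicate<double[]> condition, int classValue) {
            this.condition = condition;
            this.classValue = classValue;
        }
    }

    /** Returns the class supported by the most firing rules. */
    static int vote(List<Rule> rules, double[] instance, int numClasses, int currentClass) {
        int[] votes = new int[numClasses];
        for (Rule rule : rules) {
            if (rule.condition.test(instance)) {
                votes[rule.classValue]++;
            }
        }
        // Assumption: keep the current class when no rule fires.
        int best = currentClass;
        int bestVotes = 0;
        for (int c = 0; c < numClasses; c++) {
            // Strict comparison: on a tie, the lowest class index wins (assumption).
            if (votes[c] > bestVotes) {
                best = c;
                bestVotes = votes[c];
            }
        }
        return best;
    }
}
```

Unlike first-match classification, the position of a rule in the list does not matter here: an instance covered by one early rule for class 0 and two later rules for class 1 is reclassified to class 1.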
private Instances defineDataset(java.util.Random random) throws java.lang.Exception
Parameters:
random - random number generator
Throws:
java.lang.Exception
private boolean[] defineIrrelevant(java.util.Random random)
Parameters:
random -
private int[] defineNumeric(java.util.Random random)
Parameters:
random -
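Both defineIrrelevant and defineNumeric pick a random subset of the attributes. The documentation does not say how the subset is drawn, so the sketch below shows one common approach under that caveat: a partial Fisher-Yates shuffle that selects a fixed number of distinct attribute indices and returns them as a boolean mask. `AttributeChoiceSketch` and `chooseMask` are hypothetical names.

```java
import java.util.Random;

/** Hypothetical sketch of choosing a random subset of attributes; not the Weka source. */
final class AttributeChoiceSketch {

    /** Returns a mask with exactly numChosen entries set to true. */
    static boolean[] chooseMask(int numAttributes, int numChosen, Random random) {
        if (numChosen < 0 || numChosen > numAttributes) {
            throw new IllegalArgumentException("numChosen must be in [0, numAttributes]");
        }
        int[] indices = new int[numAttributes];
        for (int i = 0; i < numAttributes; i++) {
            indices[i] = i;
        }
        boolean[] mask = new boolean[numAttributes];
        // Partial Fisher-Yates shuffle: position i receives a uniformly chosen
        // index from the part of the array that has not been fixed yet.
        for (int i = 0; i < numChosen; i++) {
            int j = i + random.nextInt(numAttributes - i);
            int tmp = indices[i];
            indices[i] = indices[j];
            indices[j] = tmp;
            mask[indices[i]] = true;
        }
        return mask;
    }
}
```

A mask of this shape corresponds to the boolean[] returned by defineIrrelevant; for an int[] result, as returned by defineNumeric, the first numChosen entries of the shuffled index array could be returned instead.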
public java.lang.String generateFinished() throws java.lang.Exception
generateFinished in class Generator
Throws:
java.lang.Exception - no input structure has been defined
private Instances voteDataset(Instances dataset) throws java.lang.Exception
Parameters:
dataset -
Throws:
java.lang.Exception
public static void main(java.lang.String[] argv)
Parameters:
argv - should contain arguments for the data producer: