weka.core
Class Attribute

java.lang.Object
  extended by weka.core.Attribute
All Implemented Interfaces:
Copyable, java.io.Serializable

public class Attribute
extends java.lang.Object
implements Copyable, java.io.Serializable

Class for handling an attribute. Once an attribute has been created, it can't be changed.

Four attribute types are supported: numeric, nominal, string, and date (see the NUMERIC, NOMINAL, STRING, and DATE constants).

Typical usage (code from the main() method of this class):

...
// Create numeric attributes "length" and "weight"
Attribute length = new Attribute("length");
Attribute weight = new Attribute("weight");

// Create vector to hold nominal values "first", "second", "third"
FastVector my_nominal_values = new FastVector(3);
my_nominal_values.addElement("first");
my_nominal_values.addElement("second");
my_nominal_values.addElement("third");

// Create nominal attribute "position"
Attribute position = new Attribute("position", my_nominal_values);
...

Version:
$Revision: 1.29 $
Author:
Eibe Frank (eibe@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
(package private) static java.lang.String ARFF_ATTRIBUTE
          The keyword used to denote the start of an arff attribute declaration
(package private) static java.lang.String ARFF_ATTRIBUTE_DATE
          The keyword used to denote a date attribute
(package private) static java.lang.String ARFF_ATTRIBUTE_INTEGER
          A keyword used to denote a numeric attribute
(package private) static java.lang.String ARFF_ATTRIBUTE_NUMERIC
          A keyword used to denote a numeric attribute
(package private) static java.lang.String ARFF_ATTRIBUTE_REAL
          A keyword used to denote a numeric attribute
(package private) static java.lang.String ARFF_ATTRIBUTE_STRING
          The keyword used to denote a string attribute
static int DATE
          Constant set for attributes with date values.
private  java.text.SimpleDateFormat m_DateFormat
          Date format specification for date attributes
private  java.util.Hashtable m_Hashtable
          Mapping of values to indices (if nominal or string).
private  boolean m_HasZeropoint
          Whether the attribute has a zeropoint.
private  int m_Index
          The attribute's index.
private  boolean m_IsAveragable
          Whether the attribute is averagable.
private  boolean m_IsRegular
          Whether the attribute is regular.
private  double m_LowerBound
          The attribute's lower numeric bound.
private  boolean m_LowerBoundIsOpen
          Whether the lower bound is open.
private  ProtectedProperties m_Metadata
          The attribute's metadata.
private  java.lang.String m_Name
          The attribute's name.
private  int m_Ordering
          The attribute's ordering.
private  int m_Type
          The attribute's type.
private  double m_UpperBound
          The attribute's upper numeric bound.
private  boolean m_UpperBoundIsOpen
          Whether the upper bound is open.
private  FastVector m_Values
          The attribute's values (if nominal or string).
private  double m_Weight
          The attribute's weight.
static int NOMINAL
          Constant set for nominal attributes.
static int NUMERIC
          Constant set for numeric attributes.
static int ORDERING_MODULO
          Constant set for modulo-ordered attributes.
static int ORDERING_ORDERED
          Constant set for ordered attributes.
static int ORDERING_SYMBOLIC
          Constant set for symbolic attributes.
static int STRING
          Constant set for attributes with string values.
private static int STRING_COMPRESS_THRESHOLD
          Strings longer than this will be stored compressed.
 
Constructor Summary
  Attribute(java.lang.String attributeName)
          Constructor for a numeric attribute.
  Attribute(java.lang.String attributeName, FastVector attributeValues)
          Constructor for nominal attributes and string attributes.
(package private) Attribute(java.lang.String attributeName, FastVector attributeValues, int index)
          Constructor for nominal attributes and string attributes with a particular index.
  Attribute(java.lang.String attributeName, FastVector attributeValues, ProtectedProperties metadata)
          Constructor for nominal attributes and string attributes, where metadata is supplied.
(package private) Attribute(java.lang.String attributeName, int index)
          Constructor for a numeric attribute with a particular index.
  Attribute(java.lang.String attributeName, ProtectedProperties metadata)
          Constructor for a numeric attribute, where metadata is supplied.
  Attribute(java.lang.String attributeName, java.lang.String dateFormat)
          Constructor for a date attribute.
(package private) Attribute(java.lang.String attributeName, java.lang.String dateFormat, int index)
          Constructor for date attributes with a particular index.
  Attribute(java.lang.String attributeName, java.lang.String dateFormat, ProtectedProperties metadata)
          Constructor for a date attribute, where metadata is supplied.
 
Method Summary
 int addStringValue(Attribute src, int index)
          Adds a string value to the list of valid strings for attributes of type STRING and returns the index of the string.
 int addStringValue(java.lang.String value)
          Adds a string value to the list of valid strings for attributes of type STRING and returns the index of the string.
(package private)  void addValue(java.lang.String value)
          Adds an attribute value.
 java.lang.Object copy()
          Produces a shallow copy of this attribute.
(package private)  Attribute copy(java.lang.String newName)
          Produces a shallow copy of this attribute with a new name.
(package private)  void delete(int index)
          Removes a value of a nominal or string attribute.
 java.util.Enumeration enumerateValues()
          Returns an enumeration of all the attribute's values if the attribute is nominal or a string, null otherwise.
 boolean equals(java.lang.Object other)
          Tests if given attribute is equal to this attribute.
(package private)  void forceAddValue(java.lang.String value)
          Adds an attribute value.
 java.lang.String formatDate(double date)
           
 double getLowerNumericBound()
          Returns the lower bound of a numeric attribute.
 ProtectedProperties getMetadata()
          Returns the properties supplied for this attribute.
 double getUpperNumericBound()
          Returns the upper bound of a numeric attribute.
 boolean hasZeropoint()
          Returns whether the attribute has a zeropoint and may be added meaningfully.
 int index()
          Returns the index of this attribute.
 int indexOfValue(java.lang.String value)
          Returns the index of a given attribute value.
 boolean isAveragable()
          Returns whether the attribute can be averaged meaningfully.
 boolean isDate()
          Tests if the attribute is a date type.
 boolean isInRange(double value)
          Determines whether a value lies within the bounds of the attribute.
 boolean isNominal()
          Tests if the attribute is nominal.
 boolean isNumeric()
          Tests if the attribute is numeric.
 boolean isRegular()
          Returns whether the attribute values are equally spaced.
 boolean isString()
          Tests if the attribute is a string.
 boolean lowerNumericBoundIsOpen()
          Returns whether the lower numeric bound of the attribute is open.
static void main(java.lang.String[] ops)
          Simple main method for testing this class.
 java.lang.String name()
          Returns the attribute's name.
 int numValues()
          Returns the number of attribute values.
 int ordering()
          Returns the ordering of the attribute.
 double parseDate(java.lang.String string)
           
(package private)  void setIndex(int index)
          Sets the index of this attribute.
private  void setMetadata(ProtectedProperties metadata)
          Sets the metadata for the attribute.
private  void setNumericRange(java.lang.String rangeString)
          Sets the numeric range based on a string.
(package private)  void setValue(int index, java.lang.String string)
          Sets a value of a nominal attribute or string attribute.
 java.lang.String toString()
          Returns a description of this attribute in ARFF format.
 int type()
          Returns the attribute's type as an integer.
 boolean upperNumericBoundIsOpen()
          Returns whether the upper numeric bound of the attribute is open.
 java.lang.String value(int valIndex)
          Returns a value of a nominal or string attribute.
 double weight()
          Returns the attribute's weight.
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

NUMERIC

public static final int NUMERIC
Constant set for numeric attributes.

See Also:
Constant Field Values

NOMINAL

public static final int NOMINAL
Constant set for nominal attributes.

See Also:
Constant Field Values

STRING

public static final int STRING
Constant set for attributes with string values.

See Also:
Constant Field Values

DATE

public static final int DATE
Constant set for attributes with date values.

See Also:
Constant Field Values

ORDERING_SYMBOLIC

public static final int ORDERING_SYMBOLIC
Constant set for symbolic attributes.

See Also:
Constant Field Values

ORDERING_ORDERED

public static final int ORDERING_ORDERED
Constant set for ordered attributes.

See Also:
Constant Field Values

ORDERING_MODULO

public static final int ORDERING_MODULO
Constant set for modulo-ordered attributes.

See Also:
Constant Field Values

ARFF_ATTRIBUTE

static java.lang.String ARFF_ATTRIBUTE
The keyword used to denote the start of an arff attribute declaration


ARFF_ATTRIBUTE_INTEGER

static java.lang.String ARFF_ATTRIBUTE_INTEGER
A keyword used to denote a numeric attribute


ARFF_ATTRIBUTE_REAL

static java.lang.String ARFF_ATTRIBUTE_REAL
A keyword used to denote a numeric attribute


ARFF_ATTRIBUTE_NUMERIC

static java.lang.String ARFF_ATTRIBUTE_NUMERIC
A keyword used to denote a numeric attribute


ARFF_ATTRIBUTE_STRING

static java.lang.String ARFF_ATTRIBUTE_STRING
The keyword used to denote a string attribute


ARFF_ATTRIBUTE_DATE

static java.lang.String ARFF_ATTRIBUTE_DATE
The keyword used to denote a date attribute


STRING_COMPRESS_THRESHOLD

private static final int STRING_COMPRESS_THRESHOLD
Strings longer than this will be stored compressed.

See Also:
Constant Field Values

m_Name

private java.lang.String m_Name
The attribute's name.


m_Type

private int m_Type
The attribute's type.


m_Values

private FastVector m_Values
The attribute's values (if nominal or string).


m_Hashtable

private java.util.Hashtable m_Hashtable
Mapping of values to indices (if nominal or string).


m_DateFormat

private java.text.SimpleDateFormat m_DateFormat
Date format specification for date attributes


m_Index

private int m_Index
The attribute's index.


m_Metadata

private ProtectedProperties m_Metadata
The attribute's metadata.


m_Ordering

private int m_Ordering
The attribute's ordering.


m_IsRegular

private boolean m_IsRegular
Whether the attribute is regular.


m_IsAveragable

private boolean m_IsAveragable
Whether the attribute is averagable.


m_HasZeropoint

private boolean m_HasZeropoint
Whether the attribute has a zeropoint.


m_Weight

private double m_Weight
The attribute's weight.


m_LowerBound

private double m_LowerBound
The attribute's lower numeric bound.


m_LowerBoundIsOpen

private boolean m_LowerBoundIsOpen
Whether the lower bound is open.


m_UpperBound

private double m_UpperBound
The attribute's upper numeric bound.


m_UpperBoundIsOpen

private boolean m_UpperBoundIsOpen
Whether the upper bound is open.

Constructor Detail

Attribute

public Attribute(java.lang.String attributeName)
Constructor for a numeric attribute.

Parameters:
attributeName - the name for the attribute

Attribute

public Attribute(java.lang.String attributeName,
                 ProtectedProperties metadata)
Constructor for a numeric attribute, where metadata is supplied.

Parameters:
attributeName - the name for the attribute
metadata - the attribute's properties

Attribute

public Attribute(java.lang.String attributeName,
                 java.lang.String dateFormat)
Constructor for a date attribute.

Parameters:
attributeName - the name for the attribute
dateFormat - a pattern string suitable for use with java.text.SimpleDateFormat for parsing dates.
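
To illustrate what a dateFormat pattern string looks like in practice, here is a minimal sketch that uses java.text.SimpleDateFormat directly. The class DatePatternSketch and its methods are hypothetical and not part of Weka; representing a date as a double holding milliseconds since the epoch, and fixing the time zone to UTC, are assumptions made for this example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical helper, not Weka code: round-trips dates through a pattern string. */
final class DatePatternSketch {

    /** Builds a strict, UTC-based formatter for the given pattern. */
    private static SimpleDateFormat formatter(String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setLenient(false);                         // reject dates such as 2004-02-31
        format.setTimeZone(TimeZone.getTimeZone("UTC"));  // assumption: fixed zone for repeatability
        return format;
    }

    /** Parses text with the pattern; the result is milliseconds since the epoch (assumption). */
    static double parse(String pattern, String text) throws ParseException {
        return (double) formatter(pattern).parse(text).getTime();
    }

    /** Formats a millisecond value back into text using the same pattern. */
    static String format(String pattern, double millis) {
        return formatter(pattern).format(new Date((long) millis));
    }
}
```

For example, DatePatternSketch.parse("yyyy-MM-dd", "1970-01-02") yields 8.64E7 under these assumptions, and formatting that value with the same pattern returns "1970-01-02".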

Attribute

public Attribute(java.lang.String attributeName,
                 java.lang.String dateFormat,
                 ProtectedProperties metadata)
Constructor for a date attribute, where metadata is supplied.

Parameters:
attributeName - the name for the attribute
dateFormat - a pattern string suitable for use with java.text.SimpleDateFormat for parsing dates.
metadata - the attribute's properties

Attribute

public Attribute(java.lang.String attributeName,
                 FastVector attributeValues)
Constructor for nominal attributes and string attributes. If a null vector of attribute values is passed to the method, the attribute is assumed to be a string.

Parameters:
attributeName - the name for the attribute
attributeValues - a vector of strings denoting the attribute values. Null if the attribute is a string attribute.

Attribute

public Attribute(java.lang.String attributeName,
                 FastVector attributeValues,
                 ProtectedProperties metadata)
Constructor for nominal attributes and string attributes, where metadata is supplied. If a null vector of attribute values is passed to the method, the attribute is assumed to be a string.

Parameters:
attributeName - the name for the attribute
attributeValues - a vector of strings denoting the attribute values. Null if the attribute is a string attribute.
metadata - the attribute's properties

Attribute

Attribute(java.lang.String attributeName,
          int index)
Constructor for a numeric attribute with a particular index.

Parameters:
attributeName - the name for the attribute
index - the attribute's index

Attribute

Attribute(java.lang.String attributeName,
          java.lang.String dateFormat,
          int index)
Constructor for date attributes with a particular index.

Parameters:
attributeName - the name for the attribute
dateFormat - a pattern string suitable for use with java.text.SimpleDateFormat for parsing dates. Null for a default format string.
index - the attribute's index

Attribute

Attribute(java.lang.String attributeName,
          FastVector attributeValues,
          int index)
Constructor for nominal attributes and string attributes with a particular index. If a null vector of attribute values is passed to the method, the attribute is assumed to be a string.

Parameters:
attributeName - the name for the attribute
attributeValues - a vector of strings denoting the attribute values. Null if the attribute is a string attribute.
index - the attribute's index

Method Detail

copy

public java.lang.Object copy()
Produces a shallow copy of this attribute.

Specified by:
copy in interface Copyable
Returns:
a copy of this attribute with the same index

enumerateValues

public final java.util.Enumeration enumerateValues()
Returns an enumeration of all the attribute's values if the attribute is nominal or a string, null otherwise.

Returns:
enumeration of all the attribute's values

equals

public final boolean equals(java.lang.Object other)
Tests if given attribute is equal to this attribute.

Parameters:
other - the Object to be compared to this attribute
Returns:
true if the given attribute is equal to this attribute

index

public final int index()
Returns the index of this attribute.

Returns:
the index of this attribute

indexOfValue

public final int indexOfValue(java.lang.String value)
Returns the index of a given attribute value. (The index of the first occurrence of this value.)

Parameters:
value - the value for which the index is to be returned
Returns:
the index of the given attribute value if attribute is nominal or a string, -1 if it is numeric or the value can't be found
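
The lookup described above pairs a list of values with a mapping from values to indices (compare the m_Values and m_Hashtable fields). The following is a minimal stand-alone sketch of that idea. ValueIndexSketch is a hypothetical class, not Weka's implementation, and returning the existing index when a value is added twice is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a nominal or string attribute's value table. */
final class ValueIndexSketch {
    private final List<String> values = new ArrayList<>();
    private final Map<String, Integer> indexByValue = new HashMap<>();

    /** Adds a value and returns its index; an existing value keeps its index (assumption). */
    int add(String value) {
        Integer existing = indexByValue.get(value);
        if (existing != null) {
            return existing;
        }
        int index = values.size();
        values.add(value);
        indexByValue.put(value, index);
        return index;
    }

    /** Returns the index of the value, or -1 if the value is unknown. */
    int indexOfValue(String value) {
        Integer index = indexByValue.get(value);
        return index == null ? -1 : index;
    }

    /** Returns the value stored at the given index. */
    String value(int index) {
        return values.get(index);
    }

    /** Returns the number of distinct values stored. */
    int numValues() {
        return values.size();
    }
}
```

With the values "first", "second", and "third" added in that order, indexOfValue("second") returns 1 and indexOfValue("fourth") returns -1.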

isNominal

public final boolean isNominal()
Tests if the attribute is nominal.

Returns:
true if the attribute is nominal

isNumeric

public final boolean isNumeric()
Tests if the attribute is numeric.

Returns:
true if the attribute is numeric

isString

public final boolean isString()
Tests if the attribute is a string.

Returns:
true if the attribute is a string

isDate

public final boolean isDate()
Tests if the attribute is a date type.

Returns:
true if the attribute is a date type

name

public final java.lang.String name()
Returns the attribute's name.

Returns:
the attribute's name as a string

numValues

public final int numValues()
Returns the number of attribute values. Returns 0 for numeric attributes.

Returns:
the number of attribute values

toString

public final java.lang.String toString()
Returns a description of this attribute in ARFF format. Quotes strings if they contain whitespace characters, or if they are a question mark.

Returns:
a description of this attribute as a string
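
The quoting rule stated above (quote a string if it contains whitespace or is a question mark) can be sketched as follows. ArffQuoteSketch is a hypothetical helper, not Weka's code. The "@attribute" keyword, the use of single quotes, and the absence of any escaping for quotes already inside the text are assumptions made for this example.

```java
import java.util.List;

/** Hypothetical helper illustrating the quoting rule; not Weka's implementation. */
final class ArffQuoteSketch {

    /** Wraps the text in single quotes if it contains whitespace or is exactly "?". */
    static String quoteIfNeeded(String text) {
        boolean needsQuotes = text.equals("?");
        for (int i = 0; i < text.length() && !needsQuotes; i++) {
            needsQuotes = Character.isWhitespace(text.charAt(i));
        }
        // Assumption: single quotes, and no escaping of quotes already inside the text.
        return needsQuotes ? "'" + text + "'" : text;
    }

    /** Builds a declaration line for a nominal attribute (keyword is an assumption). */
    static String nominalDeclaration(String name, List<String> values) {
        StringBuilder line = new StringBuilder("@attribute ");
        line.append(quoteIfNeeded(name)).append(" {");
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                line.append(',');
            }
            line.append(quoteIfNeeded(values.get(i)));
        }
        return line.append('}').toString();
    }
}
```

Under these assumptions, a nominal attribute "position" with the values "first", "second", and "third" is rendered as @attribute position {first,second,third}, while a value such as "two words" is rendered as 'two words'.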

type

public final int type()
Returns the attribute's type as an integer.

Returns:
the attribute's type.

value

public final java.lang.String value(int valIndex)
Returns a value of a nominal or string attribute. Returns an empty string if the attribute is neither nominal nor a string attribute.

Parameters:
valIndex - the value's index
Returns:
the attribute's value as a string

addStringValue

public int addStringValue(java.lang.String value)
Adds a string value to the list of valid strings for attributes of type STRING and returns the index of the string.

Parameters:
value - The string value to add
Returns:
the index assigned to the string, or -1 if the attribute is not of type Attribute.STRING

addStringValue

public int addStringValue(Attribute src,
                          int index)
Adds a string value to the list of valid strings for attributes of type STRING and returns the index of the string. This method is more efficient than addStringValue(String) for long strings.

Parameters:
src - The Attribute containing the string value to add.
index - the index of the string value in the source attribute
Returns:
the index assigned to the string, or -1 if the attribute is not of type Attribute.STRING

addValue

final void addValue(java.lang.String value)
Adds an attribute value. Creates a fresh list of attribute values before adding it.

Parameters:
value - the attribute value

copy

final Attribute copy(java.lang.String newName)
Produces a shallow copy of this attribute with a new name.

Parameters:
newName - the name of the new attribute
Returns:
a copy of this attribute with the same index

delete

final void delete(int index)
Removes a value of a nominal or string attribute. Creates a fresh list of attribute values before removing it.

Parameters:
index - the value's index
Throws:
java.lang.IllegalArgumentException - if the attribute is not nominal or string

forceAddValue

final void forceAddValue(java.lang.String value)
Adds an attribute value.

Parameters:
value - the attribute value

setIndex

final void setIndex(int index)
Sets the index of this attribute.


setValue

final void setValue(int index,
                    java.lang.String string)
Sets a value of a nominal attribute or string attribute. Creates a fresh list of attribute values before it is set.

Parameters:
index - the value's index
string - the value
Throws:
java.lang.IllegalArgumentException - if the attribute is not nominal or string.

formatDate

public java.lang.String formatDate(double date)

parseDate

public double parseDate(java.lang.String string)
                 throws java.text.ParseException
Throws:
java.text.ParseException

getMetadata

public final ProtectedProperties getMetadata()
Returns the properties supplied for this attribute.

Returns:
metadata for this attribute

ordering

public final int ordering()
Returns the ordering of the attribute. One of the following: ORDERING_SYMBOLIC - attribute values should be treated as symbols. ORDERING_ORDERED - attribute values have a global ordering. ORDERING_MODULO - attribute values have an ordering which wraps.

Returns:
the ordering type of the attribute

isRegular

public final boolean isRegular()
Returns whether the attribute values are equally spaced.

Returns:
whether the attribute is regular or not

isAveragable

public final boolean isAveragable()
Returns whether the attribute can be averaged meaningfully.

Returns:
whether the attribute can be averaged or not

hasZeropoint

public final boolean hasZeropoint()
Returns whether the attribute has a zeropoint and may be added meaningfully.

Returns:
whether the attribute has a zeropoint or not

weight

public final double weight()
Returns the attribute's weight.

Returns:
the attribute's weight as a double

getLowerNumericBound

public final double getLowerNumericBound()
Returns the lower bound of a numeric attribute.

Returns:
the lower bound of the specified numeric range

lowerNumericBoundIsOpen

public final boolean lowerNumericBoundIsOpen()
Returns whether the lower numeric bound of the attribute is open.

Returns:
whether the lower numeric bound is open or not (closed)

getUpperNumericBound

public final double getUpperNumericBound()
Returns the upper bound of a numeric attribute.

Returns:
the upper bound of the specified numeric range

upperNumericBoundIsOpen

public final boolean upperNumericBoundIsOpen()
Returns whether the upper numeric bound of the attribute is open.

Returns:
whether the upper numeric bound is open or not (closed)

isInRange

public final boolean isInRange(double value)
Determines whether a value lies within the bounds of the attribute.

Returns:
whether the value is in range

setMetadata

private void setMetadata(ProtectedProperties metadata)
Sets the metadata for the attribute. Processes the strings stored in the metadata of the attribute so that the properties can be set up for the easy-access metadata methods. Any strings sought that are omitted will cause default values to be set. The following properties are recognised: ordering, averageable, zeropoint, regular, weight, and range. All other properties can be queried and handled appropriately by classes calling the getMetadata() method.

Parameters:
metadata - the metadata
Throws:
java.lang.IllegalArgumentException - if the properties are not consistent
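
As a rough illustration of turning string-valued metadata into typed fields with defaults, the sketch below reads some of the recognised keys from a plain java.util.Properties object. MetadataSketch is hypothetical and not Weka's implementation. The defaults used here (weight 1.0, flags false, range null) and the "true"/"false" flag encoding are placeholders chosen for the example, not Weka's documented behaviour, and the ordering property is left out because its accepted values are not listed above.

```java
import java.util.Properties;

/** Hypothetical reader for a few of the recognised metadata keys; not Weka code. */
final class MetadataSketch {
    final double weight;
    final boolean regular;
    final boolean averageable;
    final boolean zeropoint;
    final String range;

    MetadataSketch(Properties metadata) {
        this.weight = parseWeight(metadata.getProperty("weight"));
        this.regular = parseFlag(metadata, "regular");
        this.averageable = parseFlag(metadata, "averageable");
        this.zeropoint = parseFlag(metadata, "zeropoint");
        this.range = metadata.getProperty("range"); // left unparsed in this sketch
    }

    /** Parses the weight; a missing entry falls back to 1.0 (placeholder default). */
    private static double parseWeight(String text) {
        if (text == null) {
            return 1.0;
        }
        try {
            return Double.parseDouble(text.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Not a valid weight: " + text);
        }
    }

    /** Parses a boolean flag; a missing entry falls back to false (placeholder default). */
    private static boolean parseFlag(Properties metadata, String key) {
        String text = metadata.getProperty(key);
        if (text == null) {
            return false;
        }
        if (text.equalsIgnoreCase("true")) {
            return true;
        }
        if (text.equalsIgnoreCase("false")) {
            return false;
        }
        throw new IllegalArgumentException("Not a valid value for " + key + ": " + text);
    }
}
```

Properties that the sketch does not recognise are simply ignored, which mirrors the statement above that other properties are left for callers of getMetadata() to handle.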

setNumericRange

private void setNumericRange(java.lang.String rangeString)
Sets the numeric range based on a string. If the string is null the range will default to [-inf,+inf]. A square bracket represents a closed bound, a parenthesis represents an open bound, and 'inf' represents infinity. Examples of valid range strings: "[-inf,20)", "(-13.5,-5.2)", "(5,inf]"

Parameters:
rangeString - the string to parse as the attribute's numeric range
Throws:
java.lang.IllegalArgumentException - if the range is not valid
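
The range syntax described above can be parsed along the following lines. NumericRangeSketch is a hypothetical stand-alone class, not Weka's implementation. Accepting "+inf" as a synonym for "inf", rejecting ranges whose lower bound exceeds the upper bound, and treating NaN as out of range are assumptions made for this example.

```java
/** Hypothetical parser for range strings such as "[-inf,20)"; not Weka code. */
final class NumericRangeSketch {
    final double lower;
    final boolean lowerOpen;
    final double upper;
    final boolean upperOpen;

    private NumericRangeSketch(double lower, boolean lowerOpen, double upper, boolean upperOpen) {
        this.lower = lower;
        this.lowerOpen = lowerOpen;
        this.upper = upper;
        this.upperOpen = upperOpen;
    }

    /** Parses a range string; null yields the closed range [-inf,+inf]. */
    static NumericRangeSketch parse(String rangeString) {
        if (rangeString == null) {
            return new NumericRangeSketch(Double.NEGATIVE_INFINITY, false,
                                          Double.POSITIVE_INFINITY, false);
        }
        String s = rangeString.trim();
        if (s.length() < 2) {
            throw new IllegalArgumentException("Invalid range: " + rangeString);
        }
        char first = s.charAt(0);
        char last = s.charAt(s.length() - 1);
        int comma = s.indexOf(',');
        boolean delimitersOk = (first == '[' || first == '(') && (last == ']' || last == ')');
        if (!delimitersOk || comma < 0) {
            throw new IllegalArgumentException("Invalid range: " + rangeString);
        }
        double lo = parseBound(s.substring(1, comma));
        double hi = parseBound(s.substring(comma + 1, s.length() - 1));
        if (lo > hi) { // assumption: an inverted range is treated as invalid
            throw new IllegalArgumentException("Lower bound exceeds upper bound: " + rangeString);
        }
        return new NumericRangeSketch(lo, first == '(', hi, last == ')');
    }

    /** Parses one bound, where "inf" and "-inf" denote the infinities. */
    private static double parseBound(String token) {
        String t = token.trim();
        if (t.equals("inf") || t.equals("+inf")) {
            return Double.POSITIVE_INFINITY;
        }
        if (t.equals("-inf")) {
            return Double.NEGATIVE_INFINITY;
        }
        try {
            return Double.parseDouble(t);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid bound: " + token);
        }
    }

    /** Returns whether the value lies within the bounds, honouring open and closed ends. */
    boolean contains(double value) {
        if (Double.isNaN(value)) {
            return false;
        }
        boolean aboveLower = lowerOpen ? value > lower : value >= lower;
        boolean belowUpper = upperOpen ? value < upper : value <= upper;
        return aboveLower && belowUpper;
    }
}
```

With this sketch, the range "[-inf,20)" contains 5 but not 20, and "(5,inf]" excludes 5 itself. The contains method plays the role that isInRange(double) plays for an attribute.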

main

public static void main(java.lang.String[] ops)
Simple main method for testing this class.