Interface Summary | |
---|---|
DistanceParser<O extends DatabaseObject,D extends Distance<D>> | A DistanceParser shall provide a DistanceParsingResult by parsing an InputStream. |
LinebasedParser<O extends DatabaseObject> | A parser that can parse a single line. |
Parser<O extends DatabaseObject> | A Parser shall provide a ParsingResult by parsing an InputStream. |
Class Summary | |
---|---|
AbstractParser<O extends DatabaseObject> | Abstract superclass for all parsers providing the option handler for handling options. |
BitVectorLabelParser | Provides a parser for parsing one BitVector per line, bits separated by whitespace. |
DistanceParsingResult<O extends DatabaseObject,D extends Distance<D>> | Provides a list of database objects and labels associated with these objects and a cache of precomputed distances between the database objects. |
DoubleVectorLabelParser | Provides a parser for parsing one point per line, attributes separated by whitespace. |
DoubleVectorLabelTransposingParser | A parser that reads points transposed. |
FloatVectorLabelParser | Provides a parser for parsing one point per line, attributes separated by whitespace. |
NumberDistanceParser<D extends NumberDistance<D,N>,N extends Number> | Provides a parser for parsing one distance value per line. |
ParameterizationFunctionLabelParser | Provides a parser for parsing one point per line, attributes separated by whitespace. |
ParsingResult<O extends DatabaseObject> | Provides a list of database objects and labels associated with these objects. |
RealVectorLabelParser<V extends RealVector<?,?>> | Provides a parser for parsing one point per line, attributes separated by whitespace. |
SparseBitVectorLabelParser | Provides a parser for parsing one sparse BitVector per line, where the indices of the one-bits are separated by whitespace. |
SparseFloatVectorLabelParser | Provides a parser for parsing one point per line, attributes separated by whitespace. |
Parsers for different file formats and data types.
The general use case for any parser is to create DatabaseObjects out of an InputStream (e.g. by reading a data file). The DatabaseObjects are packed in a ParsingResult object which, in turn, is used by a DatabaseConnection object to create a Database containing the corresponding objects.
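
The flow from an InputStream to a result object can be illustrated with a minimal, self-contained sketch. None of the types below are the classes of this package: `SketchParser`, `SketchParsingResult`, and `SketchLineParser` are hypothetical stand-ins that only mirror the roles described above, and the result here holds just the parsed objects (no labels).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for a parsing result: the parsed objects, in input order. */
class SketchParsingResult<O> {
    final List<O> objects = new ArrayList<>();
}

/** Hypothetical stand-in for a parser: turns an InputStream into a result. */
interface SketchParser<O> {
    SketchParsingResult<O> parse(InputStream in);
}

/** Line-based sketch: applies a per-line function to every non-blank line. */
class SketchLineParser<O> implements SketchParser<O> {
    private final Function<String, O> lineParser;

    SketchLineParser(Function<String, O> lineParser) {
        this.lineParser = lineParser;
    }

    @Override
    public SketchParsingResult<O> parse(InputStream in) {
        SketchParsingResult<O> result = new SketchParsingResult<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    result.objects.add(lineParser.apply(line));
                }
            }
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch's interface small.
            throw new UncheckedIOException(e);
        }
        return result;
    }
}

class ParserPipelineSketch {
    public static void main(String[] args) {
        // An in-memory stream stands in for a data file.
        InputStream in = new ByteArrayInputStream(
            "a b\nc d\n".getBytes(StandardCharsets.UTF_8));
        SketchParser<String[]> parser =
            new SketchLineParser<String[]>(line -> line.trim().split("\\s+"));
        SketchParsingResult<String[]> result = parser.parse(in);
        System.out.println(result.objects.size() + " objects parsed");
    }
}
```

In this sketch, a consumer playing the part of a database connection would simply iterate over `result.objects`; how the real DatabaseConnection consumes a ParsingResult is not shown here.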
By default (i.e., if the user does not specify otherwise), any KDDTask will use the FileBasedDatabaseConnection which, in turn, will use the DoubleVectorLabelParser to parse a specified data file, creating a SequentialDatabase containing DoubleVector objects.
Thus, the standard procedure for using a data set from a real-valued vector space is to prepare the data set in a file in the format expected by DoubleVectorLabelParser: one point per line, attributes separated by whitespace.
For an example file that follows these requirements, see exampledata.txt.
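
As a rough illustration of the "one point per line, attributes separated by whitespace" format, the following sketch parses a single line into a numeric vector. It is not the DoubleVectorLabelParser implementation: the class `VectorLineSketch` is hypothetical, and treating every token that does not parse as a double as a label is an assumption made for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical holder for one parsed line: numeric attributes plus labels. */
class VectorLineSketch {
    final double[] attributes;
    final List<String> labels;

    VectorLineSketch(double[] attributes, List<String> labels) {
        this.attributes = attributes;
        this.labels = labels;
    }

    /**
     * Splits the line on whitespace. Tokens that parse as doubles become
     * attributes; all other tokens are kept as labels (an assumption of
     * this sketch, not a documented rule of the real parser).
     */
    static VectorLineSketch parseLine(String line) {
        List<Double> values = new ArrayList<>();
        List<String> labels = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // a blank line yields a single empty token
            }
            try {
                values.add(Double.parseDouble(token));
            } catch (NumberFormatException e) {
                labels.add(token);
            }
        }
        double[] attributes = new double[values.size()];
        for (int i = 0; i < attributes.length; i++) {
            attributes[i] = values.get(i);
        }
        return new VectorLineSketch(attributes, labels);
    }

    public static void main(String[] args) {
        VectorLineSketch point = parseLine("1.0 2.5 3 classA");
        System.out.println(Arrays.toString(point.attributes) + " " + point.labels);
        // prints "[1.0, 2.5, 3.0] [classA]"
    }
}
```

Note that `Double.parseDouble` also accepts tokens such as `NaN` or `1d`, so a label of that shape would be read as a number in this sketch.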