Environment for DeveLoping KDD-Applications Supported by Index-Structures

Package de.lmu.ifi.dbs.elki.parser

Parsers for different file formats and data types.


Interface Summary
DistanceParser<O extends DatabaseObject,D extends Distance<D>> A DistanceParser shall provide a DistanceParsingResult by parsing an InputStream.
LinebasedParser<O extends DatabaseObject> A parser that can parse a single line.
Parser<O extends DatabaseObject> A Parser shall provide a ParsingResult by parsing an InputStream.
 

Class Summary
AbstractParser<O extends DatabaseObject> Abstract superclass for all parsers providing the option handler for handling options.
BitVectorLabelParser Provides a parser for parsing one BitVector per line, bits separated by whitespace.
DistanceParsingResult<O extends DatabaseObject,D extends Distance<D>> Provides a list of database objects and labels associated with these objects and a cache of precomputed distances between the database objects.
DoubleVectorLabelParser Provides a parser for parsing one point per line, attributes separated by whitespace.
DoubleVectorLabelTransposingParser Parser reads points transposed.
FloatVectorLabelParser Provides a parser for parsing one point per line, attributes separated by whitespace.
NumberDistanceParser<D extends NumberDistance<D,N>,N extends Number> Provides a parser for parsing one distance value per line.
ParameterizationFunctionLabelParser Provides a parser for parsing one point per line, attributes separated by whitespace.
ParsingResult<O extends DatabaseObject> Provides a list of database objects and labels associated with these objects.
RealVectorLabelParser<V extends RealVector<?,?>> Provides a parser for parsing one point per line, attributes separated by whitespace.
SparseBitVectorLabelParser Provides a parser for parsing one sparse BitVector per line, where the indices of the one-bits are separated by whitespace.
SparseFloatVectorLabelParser Provides a parser for parsing one point per line, attributes separated by whitespace.
 

Package de.lmu.ifi.dbs.elki.parser Description

Parsers for different file formats and data types.

The general use case for any parser is to create DatabaseObjects from an InputStream (e.g. by reading a data file). The DatabaseObjects are packed into a ParsingResult object, which in turn is used by a DatabaseConnection object to create a Database containing the corresponding objects.
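The flow from stream to database can be illustrated with a minimal, self-contained sketch. All types below (`Result`, `StreamParser`, `InMemoryDatabase`, `connect`) are hypothetical stand-ins invented for this illustration; they are not the ELKI classes and do not reproduce their signatures.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not the ELKI types.
class ParsingFlowSketch {

    /** Stand-in for a parsing result: the objects read from a stream. */
    static final class Result<O> {
        final List<O> objects;

        Result(List<O> objects) {
            this.objects = objects;
        }
    }

    /** Stand-in for a parser: turns an InputStream into a Result. */
    interface StreamParser<O> {
        Result<O> parse(InputStream in);
    }

    /** Stand-in for a database: stores objects under sequential ids. */
    static final class InMemoryDatabase<O> {
        private final List<O> store = new ArrayList<>();

        int insert(O object) {
            store.add(object);
            return store.size() - 1;
        }

        O get(int id) {
            return store.get(id);
        }

        int size() {
            return store.size();
        }
    }

    /** Stand-in for a database connection: runs the parser and fills a database. */
    static <O> InMemoryDatabase<O> connect(StreamParser<O> parser, InputStream in) {
        InMemoryDatabase<O> db = new InMemoryDatabase<>();
        for (O object : parser.parse(in).objects) {
            db.insert(object);
        }
        return db;
    }

    /** Trivial demo parser: every non-empty line becomes one String object. */
    static Result<String> parseLines(InputStream in) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    lines.add(line.trim());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new Result<>(lines);
    }

    public static void main(String[] args) {
        InputStream in =
            new ByteArrayInputStream("first\nsecond\n".getBytes(StandardCharsets.UTF_8));
        InMemoryDatabase<String> db = connect(ParsingFlowSketch::parseLines, in);
        System.out.println(db.size() + " objects loaded");
    }
}
```

The sketch keeps the parser ignorant of the database: the parser only produces a result, and the connection step decides how the objects are stored.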

By default (i.e., unless the user requests otherwise), any KDDTask uses the FileBasedDatabaseConnection, which in turn uses the DoubleVectorLabelParser to parse the specified data file and create a SequentialDatabase containing DoubleVector objects.

Thus, the standard way to use a data set from a real-valued vector space is to prepare it as a file in the format expected by DoubleVectorLabelParser: one point per line, with attributes separated by whitespace.

This file format is also suitable for gnuplot.

For an example file that follows these requirements, see exampledata.txt.
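As a rough illustration of reading a file with one point per line and whitespace-separated attributes, the following stand-alone sketch may help. It does not use or reproduce the DoubleVectorLabelParser; the class `WhitespaceVectorSketch` and its members are hypothetical names. Two behaviours are assumptions made for this sketch rather than statements from this page: tokens that cannot be parsed as a double are kept as labels, and blank lines and lines starting with "#" are skipped.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; none of these names are ELKI classes.
class WhitespaceVectorSketch {

    /** One parsed line: numeric attributes plus any non-numeric tokens kept as labels. */
    static final class LabeledPoint {
        final double[] values;
        final List<String> labels;

        LabeledPoint(double[] values, List<String> labels) {
            this.values = values;
            this.labels = labels;
        }
    }

    /** Reads one point per line, attributes separated by whitespace. */
    static List<LabeledPoint> parse(InputStream in) {
        List<LabeledPoint> points = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                // Assumption: blank lines and '#' comment lines are skipped.
                if (line.isEmpty() || line.startsWith("#")) {
                    continue;
                }
                List<Double> values = new ArrayList<>();
                List<String> labels = new ArrayList<>();
                for (String token : line.split("\\s+")) {
                    try {
                        values.add(Double.parseDouble(token));
                    } catch (NumberFormatException e) {
                        // Assumption: a token that is not a number is a label.
                        labels.add(token);
                    }
                }
                double[] array = new double[values.size()];
                for (int i = 0; i < array.length; i++) {
                    array[i] = values.get(i);
                }
                points.add(new LabeledPoint(array, labels));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return points;
    }

    public static void main(String[] args) {
        String data = "# x y label\n1.0 2.5 a\n3.0 4.5 b\n";
        List<LabeledPoint> points =
            parse(new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8)));
        for (LabeledPoint p : points) {
            System.out.println(p.values.length + " attributes, labels " + p.labels);
        }
    }
}
```

A file in this shape can also be plotted directly with gnuplot, since each line carries the coordinates of one point in whitespace-separated columns.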


Release 0.2.1 (2009-07-13_1605)