
Online Discovery of Topics and Subtopics in a Stream of News Data

(project, diploma, bachelor's, master's thesis)

Reports on major events such as hurricanes and earthquakes, and on major topics such as the financial crisis or the Egyptian revolution, appear in Internet news and are updated at regular or irregular intervals as new insights are acquired.

The goal of this project is to maintain, online, a hierarchy of topics and subtopics as the stream of news evolves over time. New topics, or new subtopics of an existing topic, might arise over time. Existing topics or subtopics might deteriorate, and a subtopic might be "upgraded" to a topic, and so on. We use clustering to detect the hierarchy of topics and subtopics. The data for the analysis come from Twitter.
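To make the kinds of change described above concrete (topics fading, subtopics being upgraded), here is a minimal Java sketch of a two-level hierarchy. It is not the method of the project or of the cited paper. The class names `TopicNode` and `TopicHierarchy`, the exponential decay of weights, and the three thresholds are all hypothetical assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical node: a topic (parent == null) or a subtopic. */
class TopicNode {
    final String label;
    double weight;                 // faded count of reports assigned to this node
    TopicNode parent;              // null for a top-level topic
    final List<TopicNode> children = new ArrayList<>();

    TopicNode(String label, double weight) {
        this.label = label;
        this.weight = weight;
    }
}

/** Hypothetical two-level hierarchy; the update rules are illustrative assumptions. */
class TopicHierarchy {
    final List<TopicNode> topics = new ArrayList<>();
    private final double decay;        // assumed fade factor per time step, in (0, 1)
    private final double minWeight;    // assumed: nodes below this weight are dropped
    private final double promoteRatio; // assumed: subtopic is upgraded at ratio * parent weight

    TopicHierarchy(double decay, double minWeight, double promoteRatio) {
        this.decay = decay;
        this.minWeight = minWeight;
        this.promoteRatio = promoteRatio;
    }

    TopicNode addTopic(String label, double weight) {
        TopicNode t = new TopicNode(label, weight);
        topics.add(t);
        return t;
    }

    TopicNode addSubtopic(TopicNode parent, String label, double weight) {
        TopicNode s = new TopicNode(label, weight);
        s.parent = parent;
        parent.children.add(s);
        return s;
    }

    /** One time step: fade all weights, drop faded nodes, upgrade dominant subtopics. */
    void tick() {
        List<TopicNode> promoted = new ArrayList<>();
        Iterator<TopicNode> topicIt = topics.iterator();
        while (topicIt.hasNext()) {
            TopicNode topic = topicIt.next();
            topic.weight *= decay;
            Iterator<TopicNode> subIt = topic.children.iterator();
            while (subIt.hasNext()) {
                TopicNode sub = subIt.next();
                sub.weight *= decay;
                if (sub.weight < minWeight) {
                    subIt.remove();                 // subtopic has deteriorated
                } else if (sub.weight >= promoteRatio * topic.weight) {
                    subIt.remove();                 // subtopic outgrew its parent
                    sub.parent = null;
                    promoted.add(sub);
                }
            }
            // A faded topic is kept while it still has live subtopics.
            if (topic.weight < minWeight && topic.children.isEmpty()) {
                topicIt.remove();
            }
        }
        topics.addAll(promoted);                    // upgraded subtopics become topics
    }
}
```

In this sketch, new reports would raise a node's `weight` between calls to `tick()`. How weights are actually derived from the clustering is left open here.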

Tasks

  • Development of a hierarchical clustering algorithm for the detection of the hierarchy of topics and subtopics.
  • Online maintenance of the hierarchy as new reports/tweets arrive over time.
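As a rough illustration of what online processing of arriving tweets could look like, the Java sketch below assigns each incoming text to the most similar existing cluster or opens a new one. This is a simple flat, threshold-based scheme swapped in for illustration, not the hierarchical algorithm the project asks for and not the existing code base. The class `IncrementalClusterer`, the term-frequency representation, and the cosine similarity threshold are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical one-pass clusterer over term-frequency vectors. */
class IncrementalClusterer {
    // Each cluster is summarised by the summed term frequencies of its members.
    final List<Map<String, Double>> centroids = new ArrayList<>();
    private final double threshold; // assumed minimum cosine similarity to join a cluster

    IncrementalClusterer(double threshold) {
        this.threshold = threshold;
    }

    /** Lower-cases the text, splits on non-word characters, and counts terms. */
    static Map<String, Double> termFrequencies(String text) {
        Map<String, Double> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1.0, Double::sum);
            }
        }
        return tf;
    }

    /** Cosine similarity of two sparse vectors; 0 if either vector is empty. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    /** Returns the index of the cluster the text was assigned to. */
    int assign(String text) {
        Map<String, Double> doc = termFrequencies(text);
        int best = -1;
        double bestSim = threshold;
        for (int i = 0; i < centroids.size(); i++) {
            double sim = cosine(doc, centroids.get(i));
            if (sim >= bestSim) {
                best = i;
                bestSim = sim;
            }
        }
        if (best < 0) {
            centroids.add(doc);                  // nothing similar enough: new cluster
            return centroids.size() - 1;
        }
        Map<String, Double> centroid = centroids.get(best);
        doc.forEach((term, f) -> centroid.merge(term, f, Double::sum));
        return best;
    }
}
```

Because cosine similarity ignores vector length, the summed frequencies serve as the cluster direction without normalisation. One conceivable way to obtain subtopics would be to run the same procedure inside each cluster with a stricter threshold, but that is only a suggestion.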

The new methods should be incorporated into the existing code from our recent paper ("Discovering Global and Local Bursts in a Stream of News Data", SAC'12, Data Stream track).


Requirements

  • Good programming skills (Java)
  • Knowledge of basic KDD concepts
  • Independent work

Contact
If you are interested in this topic or have further questions, please contact Irene Ntoutsi. (This is a joint project with Prof. Myra Spiliopoulou and Max Zimmermann at the University of Magdeburg.)

Note that communication and cooperation will be in English.
