Lehr- und Forschungseinheit für Datenbanksysteme

Big Data Management and Analytics (WS 2017/18)

News

  • The results of the exam can be seen in UniWorX. The exam review session is scheduled for Thu 26.04.2018, 16:00 (s.t.) - 17:00, in the DBS conference room 157.
  • The follow-up exam is scheduled for Mon. 09.04.2018 16:00-18:00 in M 218 (Main building). The registration in UniWorX for the follow-up exam is open.
  • The results of the exam can be seen in UniWorX. The exam review session is scheduled for Thu 29.03.2018, 10:00 (s.t.) - 11:30, in the DBS conference room 157.
  • There will be a follow-up exam towards the end of the semester or the start of the new semester. Time and place will be announced soon.
  • Announcements for the upcoming exam - With name assignments!
  • A computation error in the slides for exercise 7-2 has been corrected. In addition, a more detailed computation of the individual steps in 7-2 is provided.
  • Compensation for disadvantages (Nachteilsausgleich): All students who are eligible for an extension of the writing time (Schreibzeitverlängerung) should notify us by 31.01.2018 at the latest.
  • The exam will be on Wed. 14.02.2018 12:00-14:00. You can register for the exam on UniWorX.
  • We now also provide the quizzes from the tutorials online!
  • The videos of the lectures are online at videoonline.edu.lmu.de.
  • Course registration in UniWorX is now open.

Organisation

  • Scope: 3+2 hours weekly (equals 6 ECTS)
  • Lecturer: Prof. Dr. Matthias Schubert
  • Assistants: Julian Busch, Daniyal Kazempour
  • Required: Lecture "Database Systems I" or equivalent
  • Beneficial: Lecture "Knowledge Discovery in Databases I" or equivalent
  • Audience: The lecture is aimed at Bachelor students (5th term) and Master students in Media Informatics, Bioinformatics, and Informatics

Time and Locations

All times are c.t. (cum tempore)

Component   When                  Where                                     Starts at
Lecture     Tue, 13.00 - 16.00 h  Room S 004 (Schellingstr. 3)              17.10.2017
Tutorial 1  Wed, 14.00 - 16.00 h  Room D Z007 (HGB)                         25.10.2017
Tutorial 2  Wed, 16.00 - 18.00 h  Room D Z007 (HGB)                         25.10.2017
Tutorial 3  Thu, 16.00 - 18.00 h  Room B 185 (Edmund-Rumpler-Str. 13)       26.10.2017
Tutorial 4  Thu, 14.00 - 16.00 h  Room B 185 (Edmund-Rumpler-Str. 13)       26.10.2017

Content

In almost all areas of business, industry, science, and everyday life, the amount of available data that contains value and knowledge is immense and growing fast. However, turning data into information, information into knowledge, and knowledge into value is challenging. To extract the knowledge, the data needs to be stored, managed, and analyzed. In doing so, we have to cope not only with an increasing amount of data, but also with increasing velocity (i.e., data streamed at high rates) and with heterogeneous data sources, and we increasingly have to take the quality and reliability of data and information into account. These properties, referred to as the four V's (Volume, Velocity, Variety, and Veracity), are the key properties of "Big Data". Big Data grows faster than our ability to process it, so we need new architectures, algorithms, and approaches for managing, processing, and analyzing Big Data that go beyond traditional concepts for knowledge discovery and data mining.

This course introduces Big Data, the challenges associated with it, and basic concepts for Big Data Management and Big Data Analytics, which are important components of the new and popular field of Data Science.

 

Course Schedule

Lecture Tutorial
Date Topic Date Topic
17.10.17 Lecture 1: Data Science: The Big Picture ---
24.10.17 Lecture 2: NoSQL Databases 25.10.17
26.10.17

Tutorial 1

Solution

31.10.17 no lecture (500th Reformation Day) 01.11.17
02.11.17
no tutorials for this week
07.11.17 Lecture 3: Batch Systems 08.11.17
09.11.17

Tutorial 2

movie dataset

blob dataset

mouse dataset

Solution

14.11.17 Lecture 4: Apache Spark 15.11.17
16.11.17

 Tutorial 3

Slides for Tutorial 3

21.11.17 Lecture 5: Stream Processing 22.11.17
23.11.17

 Tutorial 4

customercookies.csv

kMeans_template.py

Solutions

Quiz

28.11.17 Lecture 6: Apache Flink 29.11.17
30.11.17

 Tutorial 5

 Solutions/Slides

 Quiz

05.12.17 Lecture 7: Stream Analytics 06.12.17
07.12.17

 Tutorial 6

 mm_flink_template.java

 mm_flink_solution.java

 WordCountDataSet.java

 WordCountDataStream.java

 Slides

 Quiz

12.12.17 Lecture 7: Stream Analytics (cont.) 13.12.17
14.12.17

 Tutorial 7

 Slides

 Quiz

19.12.17 Lecture 8: High Dimensional Data 20.12.17
21.12.17

 Tutorial 8

 Slides

 Quiz

09.01.18 Lecture 8: High Dimensional Data (cont.) 10.01.18
11.01.18

 Tutorial 9

 Slides

 PowerIteration Code

 Quiz

16.01.18 Lecture 9: Community Detection 17.01.18
18.01.18

 Tutorial 10

 Slides

 Steam SVD Code

 Quiz

 

23.01.18 Lecture 10: Node Importance and Neighborhoods 24.01.18
25.01.18

 Tutorial 11

 Slides

 Quiz

30.01.18 Lecture 11: The Flip Side of the Coin 31.01.18
01.02.18

 Tutorial 12

 Slides

06.02.18 Questions & Answers 07.02.18
08.02.18
no tutorials for this week