Course, academic year 2021-2022

30416 - BIG DATA AND DATABASES

Department of Decision Sciences

Course taught in English
BEMACS (6 credits - I sem. - OB  |  SECS-S/01)
Course Director:
LUCA MOLTENI

Classes: 25 (I sem.)
Instructors:
Class 25: LUCA MOLTENI


Suggested background knowledge

Basic descriptive and inferential statistics knowledge and basic computer skills.

Mission & Content Summary

MISSION

The course provides an overview of data management architectures and analytics procedures aimed at organising, describing and modeling Big Data (structured and unstructured). It covers both the technical aspects of data management and analytics and the managerial evaluation of the analyses (how to translate the outputs into meaningful business insights).

CONTENT SUMMARY

  • Introduction to data management and analytics.
  • Data management architectures: relational databases (OLTP, Data warehouse and SQL language).
  • Data management architectures: Big data and NoSQL databases (distributed file system, Hadoop, Spark and Data Lake concept).
  • Data analytics: Data understanding and data preparation.
  • Data analytics: Models and statistical techniques applied to Big Data.
    • Regression and classification trees.
    • Ensemble methods (random forest and boosted trees).
    • Logistic regression.
    • Supervised Artificial Neural Networks.
  • Data analytics: Models' performance evaluation.

Intended Learning Outcomes (ILO)

KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will have acquired the following competences:

  • Big Data ingestion and management.
  • Data preparation and cleaning.
  • Machine learning algorithms application.
  • Machine learning model evaluation.

APPLYING KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to:
  • Improve their ability to manage and take advantage of the huge amount of data now produced by a great variety of sources.

Teaching methods

  • Face-to-face lectures
  • Guest speakers' talks (in class or remote)
  • Exercises (exercises, database, software etc.)
  • Case studies / Incidents (traditional, online)
  • Group assignments

DETAILS

Two different approaches are used: theoretical and applied.

  • A number of data ingestion procedures and machine learning case histories are shown, on both Big and Small Data, using specific data management and machine learning software.
  • At the end of the course, students are able to replicate all the procedures and analyses by themselves.

Assessment methods

  Continuous assessment Partial exams General exam
  • Oral individual exam
    x
  • Group assignment (report, exercise, presentation, project work etc.)
    x
  • Active class participation (virtual, attendance)
    x
  • Peer evaluation
    x

ATTENDING STUDENTS

The assignments assess the student's ability to perform data analysis. The oral exam aims mainly to assess the knowledge and the understanding of the procedures required for data ingestion and management and the main features of the machine learning algorithms presented in the course.


NON-ATTENDING STUDENTS

The oral exam aims mainly to assess the knowledge and the understanding of the procedures required for data ingestion and management, the main features of the machine learning algorithms presented in the course, and the ability to interpret the output of machine learning software.


Teaching materials


ATTENDING AND NON-ATTENDING STUDENTS

  • M. KUHN, K. JOHNSON, Applied predictive modeling, Springer, 2013.
  • P. WILTON, J. W. COLBY, Beginning SQL, Wrox, 2005.
  • N. DASGUPTA, Practical Big Data Analytics, Packt Publishing, 2018.
  • Teachers' slides.
Last change 22/10/2021 12:47