Course, academic year 2025-2026

20570 - DATA ANALYTICS AND VISUALIZATION

Department of Decision Sciences



Course taught in English
EMIT (8 credits - II sem. - OB  |  SECS-S/01)
Course Director:
RAFFAELLA PICCARRETA

Classes: 27 (II sem.) - 28 (II sem.)
Instructors:
Class 27: FILIPPO TRENTINI, Class 28: RAFFAELLA PICCARRETA


Suggested background knowledge

For an effective learning experience, students are strongly recommended to have a basic knowledge of statistics (as taught in the undergraduate programs at Bocconi), in particular of univariate and bivariate descriptive statistics (graphs, especially boxplots; summary measures, especially mean, variance, covariance, and correlation) and of basic inference (in particular hypothesis testing and p-values). A basic knowledge of the software R (RStudio) is also expected (at least R objects, i.e. vectors, matrices, data frames, and lists, and basic functions). The preparatory course 20354 provides material to get aligned, and includes online tests to verify the level of knowledge and understanding of the concepts used during the course. Online meetings are organized in September and at the end of January/beginning of February.

Mission & Content Summary

MISSION

Modern graduates need to use data to a much greater extent than their predecessors. Data management (retrieving, filtering, or cleaning), exploratory data analysis, and appropriate data visualization are becoming more and more relevant in any field. In this course, students are introduced to problems related to the extraction of information from data collected on a large number of variables and cases, and gain an applied understanding of the most relevant techniques of data analytics, with specific reference to unsupervised learning. The key goal of the course is to illustrate methods useful to analyse and summarize the most salient features of data sets with respect to both the variables and the cases. The course features hands-on classes, where the application of each technique is discussed with reference to real datasets.

CONTENT SUMMARY

Data analytics is a broad term that defines the activities in the process of analysing data to draw meaningful and actionable insights. It involves a number of steps and procedures, including:

•        Data manipulation and analysis, aimed at discovering the salient patterns in data

•        Visualisation (i.e. effective presentation) of results, and their interpretation and communication to stakeholders, in order to drive business strategy and outcomes.

 

The course introduces exploratory techniques to efficiently analyse, summarize and visualize (relatively) large data sets. The goal is to reduce the dimension of data while preserving information about the most salient/distinctive features. Such simplification applies both to variables and to cases.

 

The course is structured as follows:

·        Introduction to multivariate data

In the first part of the course, summaries of data collected on many variables will be introduced, by extending measures of central tendency and dispersion to the multivariate case.

·        Dimensionality reduction techniques

We will introduce Principal Component Analysis and Factor Analysis, two techniques aimed at discovering low-dimensional indicators/summaries that capture some structure underlying the (possibly high-dimensional) input data.

·        Clustering techniques

The last part of the course introduces techniques to group cases based on their similarities or differences.
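
As a compact reminder of the quantities behind these three parts, the block below writes out standard textbook definitions. The notation (n cases, p variables, the divisor n - 1 in the covariance matrix, Euclidean distance as the dissimilarity) is an assumption made here for illustration and is not taken from the course slides.

```latex
% Assumed notation: n cases, p variables; x_i in R^p is the vector observed on case i.
% Requires amsmath for \text.

% Multivariate central tendency and dispersion: mean vector and covariance matrix.
\[
\bar{\mathbf{x}} = \frac{1}{n}\sum_{i=1}^{n}\mathbf{x}_i ,
\qquad
\mathbf{S} = \frac{1}{n-1}\sum_{i=1}^{n}
  (\mathbf{x}_i-\bar{\mathbf{x}})(\mathbf{x}_i-\bar{\mathbf{x}})^{\top}
\]

% Principal components: eigenvectors of S, ordered by decreasing eigenvalue.
\[
\mathbf{S}\,\mathbf{v}_j = \lambda_j\,\mathbf{v}_j ,
\qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0 ,
\qquad
\text{share of variance of component } j
  = \frac{\lambda_j}{\sum_{k=1}^{p}\lambda_k}
\]

% Clustering: one common dissimilarity between cases i and h (Euclidean distance).
\[
d(\mathbf{x}_i,\mathbf{x}_h) = \sqrt{\sum_{k=1}^{p}(x_{ik}-x_{hk})^{2}}
\]
```

When variables are measured on different scales, the same decomposition is often applied to the correlation matrix instead of S; which choice the course adopts is not stated in this syllabus.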

 

Beyond traditional classes, the course features hands-on classes, where the statistical software R, and in particular the integrated development environment (IDE) RStudio, is used to apply the techniques considered (Principal Component Analysis, Factor Analysis, and Cluster Analysis) to real data, and to properly interpret and present results using suitable visualisation tools.


Intended Learning Outcomes (ILO)

KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to:

·        Identify the technique most suitable to simplify relevant information in a dataset with reference to a specific goal of analysis.

·        Recognize appropriate and inappropriate applications and approaches with reference to a specific goal of analysis.

·        Justify the adoption of a specific path of analysis and of the choices made during the analysis.

·        Compare the results obtained using different approaches and evaluate the stability of results.

·        Write R scripts to analyse data.

APPLYING KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to:

·        Design/develop scripts in the R programming language to read, manipulate, analyse and visualise data.

·        Interpret and critically analyse results, emphasizing the most relevant conclusions both from a technical and from an interpretative point of view.

·        Effectively present the output, using suitable visualization tools allowing an immediate and unbiased understanding of the most salient features in data.


Teaching methods

  • Lectures
  • Practical Exercises
  • Collaborative Works / Assignments

DETAILS

During the course, there will be 3 blocks of hands-on classes, one for each of the three techniques taught during the course. For each technique, teams of students will work on an assignment concerning a substantive problem using data analysis.

These assignments assess the ability to design a workflow to analyse data using the software R, as well as the ability to draw substantive conclusions based on the software output.

During each hands-on class, teams will answer the specific questions presented in class in a memorandum to be uploaded to Bboard by the end of the class.

Each block of hands-on classes will be followed by a session of individual tests, with questions on the theoretical aspects of the technique considered, on the results obtained during the hands-on classes, and on the choices made by the students and their team in developing the analysis.

Students who actively participate in group work and take the individual tests can take the exam as attending students (see the section on assessment methods for details and rules).


Assessment methods

  Assessment type and when it applies (continuous assessment, partial exams, or general exam):
  • Written individual exam (traditional/online): general exam
  • Collaborative Works / Assignment (report, exercise, presentation, project work etc.): continuous assessment
  • Peer evaluation: continuous assessment

ATTENDING STUDENTS

Effective class participation includes attendance, preparation, making an active and constructive contribution to the class discussion, asking questions, making constructive comments, and having a positive attitude toward learning.

Students are considered attending if they participate in the activities described below.

  • During the course, there will be 3 blocks of 3 hands-on classes (thus 9 hands-on classes in total). Each block of hands-on classes will focus on one of the techniques taught during the course. For each technique, teams of students (formed by the instructors) will work on an assignment concerning a substantive problem using data analysis.
    These assignments assess the ability to design a workflow to analyse data using the software R, as well as the ability to draw substantive conclusions based on the software output.
    Students must be able and ready to contribute to their team’s assignment, both with the R commands needed to perform the required analyses and with their knowledge of the technique, so that they can help define the path of analysis and interpret and critically evaluate the results obtained. During each block of hands-on classes, teams will answer the specific questions presented in class in a memorandum to be uploaded to Bboard by the end of each class. To receive the points assigned to their team, students must attend at least 2 of the 3 hands-on classes in a given block. Students who miss one of the three classes will receive 2/3 of the grade assigned to their team.
  • Each block of hands-on classes will be followed by a session of individual tests, with questions on the theoretical aspects of the technique considered as well as questions concerning the data analysed during the hands-on classes. Students are allowed to take these tests even if they missed more than one hands-on class.
    These tests (on Blackboard) assess knowledge of the techniques introduced in the course, also with respect to the output obtained. They last 45 minutes.

 

The assessment of attending students who take the exam in one of the first two sessions is based on three main components:

  • The team assignments count for 20% of the final grade (6 points overall, 2 points for each block of hands-on classes). Students should be aware that a peer review process will be in place, and that critical situations reported by peers may lead to a substantial reduction of the final grade.
  • The individual tests taken during the course count for 30% of the final grade (9 points overall, 3 points for each individual test).
  • A final practical exam at the end of the course (denoted as S, scritto, on the Bocconi website), counting for 50% of the final grade (15 points overall) and consisting of an in-class (lab) computer assignment. This exam is the same as for non-attending students. It lasts 2 hours and 30 minutes and concerns the application of one of the techniques seen in the course.
    Students will use their own laptop to analyse a set of data using the techniques illustrated during the course, writing a script from scratch using the software R and preparing a short report with their analysis, including a substantive interpretation of the results obtained.
    The exam assesses the individual ability to apply the techniques illustrated during the course, to coherently design a workflow to analyse data using the software R, and to draw substantive conclusions on the data at hand based on the software output.
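
The point arithmetic described above can be sketched in a few lines. This is only an illustrative sketch, written in Python rather than the R used in the course: the function names are hypothetical, and it assumes that points are simply added with no rounding, that a missed individual test is entered as 0, and that any peer-review adjustment is applied separately. None of these details are stated in the syllabus, and the rules for qualifying as an attending student are not modelled.

```python
# Hypothetical helper names; the point values follow the syllabus text.
CLASSES_PER_BLOCK = 3
TEAM_MAX_PER_BLOCK = 2.0   # 3 blocks x 2 points = 6 points (20%)
TEST_MAX = 3.0             # 3 tests x 3 points = 9 points (30%)
FINAL_MAX = 15.0           # final practical exam (50%)


def team_points(team_score, classes_attended):
    """Points one student receives for one block of hands-on classes."""
    if not 0 <= team_score <= TEAM_MAX_PER_BLOCK:
        raise ValueError("team score must be between 0 and 2")
    if not 0 <= classes_attended <= CLASSES_PER_BLOCK:
        raise ValueError("classes attended must be between 0 and 3")
    if classes_attended == CLASSES_PER_BLOCK:
        return float(team_score)        # attended all three classes
    if classes_attended == CLASSES_PER_BLOCK - 1:
        return team_score * 2 / 3       # missed one class: 2/3 of the team grade
    return 0.0                          # fewer than 2 classes: no team points


def attending_grade(team_scores, attendance, test_scores, final_score):
    """Total out of 30 (assumed: plain sum, no rounding, no peer-review adjustment)."""
    if not (len(team_scores) == len(attendance) == len(test_scores) == 3):
        raise ValueError("expected three blocks and three individual tests")
    if any(not 0 <= t <= TEST_MAX for t in test_scores):
        raise ValueError("each test score must be between 0 and 3")
    if not 0 <= final_score <= FINAL_MAX:
        raise ValueError("final exam score must be between 0 and 15")
    team = sum(team_points(s, a) for s, a in zip(team_scores, attendance))
    return team + sum(test_scores) + final_score


if __name__ == "__main__":
    # Full marks everywhere, but one class missed in the second block.
    print(attending_grade([2, 2, 2], [3, 2, 3], [3, 3, 3], 15))
```

For instance, a student whose team earns the full 2 points in a block but who attended only two of its three classes would be credited 2 × 2/3 ≈ 1.33 points for that block under these assumptions.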

 

Below are the dates scheduled for the 3 hands-on classes and the individual test of each of the three modules:

 

PCA: March 3, 4, 5 + individual test on March 6

FA: March 31, April 1, 2 + individual test on April 3

CA: May 6, 7, 8 + individual test on May 12

 

Important:

  • Students can take the exam as attending only at the first 2 exam sessions.
  • Students who qualify as attending can in any case decide to take the exam as not attending. However, registering for the exam as not attending implies losing the status of attending student.
  • Students who register for the exam as attending can take the exam as not attending at subsequent exam sessions.
  • Withdrawal or failure does not imply losing the status of attending student.
  • Students who skip more than one block of hands-on classes and more than one individual test cannot qualify as attending.
  • Students who skip an individual test will not be allowed to retake it.
  • Students who sign in as attending and are not present in class will not be allowed to take the exam as attending students, in addition to the consequences stated in the honour code.
  • There is no midterm exam.
  • To be admitted to the final exam, it is mandatory to register for it. No exceptions will be made to this rule.
  • Students of past years who already sat the final (practical) exam and/or who participated in the team assignments in past years cannot qualify as attending. This is in line with the rules stated in the syllabi of past years. The same rule will apply to the students enrolled in the current academic year.

NOT ATTENDING STUDENTS

The exam for non-attending students (or for attending students who decide, for any reason, to take the exam as not attending) is structured as follows.

  • A final practical exam at the end of the course (denoted as S, scritto, on the Bocconi website), counting for 70% of the final grade (21 points) and consisting of an in-class (lab) computer assignment. This exam is the same as for attending students. It lasts 2 hours and 30 minutes and concerns the application of one of the techniques seen in the course.
    Students will use their own laptop to analyse a set of data using the techniques illustrated during the course, writing a script from scratch using the software R and preparing a short report with their analysis, including a substantive interpretation of the results obtained.
    The exam assesses the individual ability to apply the techniques illustrated during the course, to coherently design a workflow to analyse data using the software R, and to draw substantive conclusions on the data at hand based on the software output.
  • A theoretical exam counting for 30% of the final grade (9 points). It is a test on Blackboard (Respondus is needed) with questions on the theoretical aspects of the techniques considered as well as questions concerning the analysis and interpretation of outputs. The test assesses knowledge of the techniques introduced in the course, also with respect to the output obtained; it lasts 1 hour and follows the practical exam (after a short break).

 

Important:

  • There is no midterm exam.
  • To be admitted to the final exam, it is mandatory to register for it. No exceptions will be made to this rule.
  • Students of past years who already sat the final (practical) exam and/or who participated in the team assignments in past years must take the exam as not attending. This is in line with the rules stated in the syllabi of past years, and is consistent with the structure of the exam in past years.

 


Teaching materials


ATTENDING AND NOT ATTENDING STUDENTS

The course slides are quite complete. Interested students will be provided with a list of textbooks offering a more detailed description of the techniques considered.

Last change 30/11/2025 12:13