Course 2023-2024 a.y.


Department of Management and Technology

Course taught in English

EMIT (7 credits - I sem. - OB  |  SECS-P/06)
Course Director:

Classes: 27 (I sem.) - 28 (I sem.)

Synchronous Blended: Lectures delivered synchronously in the classroom (max 1 hour per credit delivered synchronously online)

Mission & Content Summary

This course aims to provide students with a strong foundation in the principles and techniques of data analysis and econometric modeling. The first part will cover topics such as data manipulation, data visualization, and statistical analysis using Python, one of the most popular programming languages. In this part of the course, students will learn how to write efficient and effective code to analyze and interpret data and will develop the ability to communicate their findings in a clear and concise manner. The second part of the course will cover statistical inference and regression analysis. Students will learn how to apply these techniques to real-world data sets and will develop the skills necessary to interpret and communicate their findings effectively. Throughout the course, a special emphasis will be put on the importance of critical thinking and problem-solving in the field of data analysis and econometrics. By the end of the course, students will be equipped with the knowledge and skills necessary to conduct rigorous data analysis and econometric modeling in a variety of contexts in their future academic or professional pursuits.


Part I

A. Essentials of coding

1.     Installation and setup (IDE and editors, Jupyter notebooks, use on servers)

2.     Basics of Python language

─       Variables

─       Built-in data types: numeric, boolean, strings

─       Containers: lists, dictionaries, tuples, sets

─       Operators: assignment, logical comparison, Boolean

3.     Control flow statements

─       if, elif, else

─       for loop

─       while loop

─       continue, break

─       try, except

4.     Functions and modules

─       Custom defined functions: keywords, docstring, variable scope

─       Anonymous (lambda) functions

─       Packages and modules
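As an informal illustration of the topics listed in section A (a sketch, not official course material), the snippet below combines a custom function with a docstring and a keyword argument, a for loop with continue, try/except, and a lambda function. The function name `safe_mean` and the sample values are hypothetical.

```python
def safe_mean(values, default=0.0):
    """Return the mean of the numeric items in `values`.

    Items that cannot be converted to float are skipped.
    If no item is numeric, `default` is returned.
    """
    total = 0.0
    count = 0
    for item in values:
        try:
            number = float(item)
        except (TypeError, ValueError):
            continue  # skip non-numeric entries
        total += number
        count += 1
    if count == 0:
        return default
    return total / count


# Hypothetical sample mixing numbers, numeric strings, and invalid entries.
raw = [10, "20", None, "n/a", 30.0]
mean_value = safe_mean(raw)

# A lambda used as a sort key: order words by length.
words = ["panel", "ols", "probit"]
by_length = sorted(words, key=lambda w: len(w))
```

Here `total` and `count` are local to the function, which is one simple instance of variable scope.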

B. Data manipulation

─       Pandas library

─       Series and DataFrames

─       Indexing

─       Essential functionalities (filtering, renaming, viewing, handling missing etc.)

─       Import and export data: data formats (CSV, Stata, Excel, JSON, etc.)

─       Import data from databases (e.g. MySQL), import data from URLs (use of APIs)

─       Data wrangling: operations on numeric, operations on strings

─       Aggregating data and by group operations

─       Iterating dataframes: apply functions

─       Merge, join, append, concat data

─       Reshape: stack, unstack, pivot

C. Data visualization

─       Plotting data: Matplotlib and other plotting libraries

─       Common plots

D. Introduction to text analysis

─       Regular expressions

─       NLTK and spaCy libraries

─       Text representation: TF-IDF, word embedding
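To give a rough idea of what regular expressions and TF-IDF involve, here is a minimal sketch using only the Python standard library. It assumes the plain, unsmoothed textbook definition (term frequency times the log of the inverse document frequency); library implementations typically use smoothed or normalized variants. The corpus and the function names `tokenize` and `tf_idf` are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus of three documents.
docs = [
    "Panel data models use fixed effects.",
    "Random effects models are an alternative.",
    "Python makes data analysis easier.",
]


def tokenize(text):
    """Lowercase the text and extract alphabetic tokens with a regular expression."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(corpus):
    """Return one {term: weight} dict per document.

    tf  = term count / number of tokens in the document
    idf = log(N / number of documents containing the term)
    """
    tokenized = [tokenize(doc) for doc in corpus]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append(
            {
                term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return weights


scores = tf_idf(docs)
```

In this toy corpus, a term that occurs in a single document (such as "panel") receives a higher weight than one shared by two documents (such as "data").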


Part II

A. The nature of econometric data

·        Data generation process: Experimental Data vs Nonexperimental Data

·        Data Types: Time-Series Data, Cross-Section Data, Panel or Longitudinal Data

·        The Research Process and the set-up of an econometric model

B. The simple linear regression model

·         Economic Model and Econometric Model: Introducing the Error Term

·         Estimating the Regression Parameters: The Least Squares Principle

·         Interpreting Estimates

·         Nonlinear relationships and indicator variables
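The least squares principle can be sketched in a few lines of plain Python. The example assumes the standard closed-form estimates for a model with one regressor, y = b1 + b2 x + e: the slope is the sum of cross-deviations divided by the sum of squared deviations of x, and the intercept is the mean of y minus the slope times the mean of x. The function name `ols_simple` and the data are hypothetical.

```python
def ols_simple(x, y):
    """Least squares estimates (intercept, slope) for y = b1 + b2 * x + e."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x must vary across observations")
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data generated exactly from y = 1 + 2x (no error term),
# so the estimates recover the true parameters.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
b1, b2 = ols_simple(x, y)
```

With real data the error term is not zero, and the estimates would be interpreted together with their standard errors.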

C. Interval Estimation and Hypothesis Testing

·        Interval estimation

·        Hypothesis testing

·        P-value and the rejection region

D. The Multiple Regression Model

·        Estimation and inference

E. Qualitative and Binary Dependent Variable Models

·        LPM, Logit and Probit

F. Panel Data Models

·       Pooled model, Fixed effect model and Random effect model
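As a hedged illustration of the fixed effect model, the sketch below computes the within estimator for a single regressor: each variable is demeaned by entity, which removes the entity-specific intercepts, and the slope is then obtained by least squares on the demeaned data. The function name `within_estimator` and the two-firm panel are hypothetical; in practice a dedicated econometrics library would be used.

```python
from collections import defaultdict


def within_estimator(entities, x, y):
    """Fixed effects (within) slope estimate for y_it = a_i + b * x_it + e_it."""
    # Group the observations by entity.
    groups = defaultdict(list)
    for entity, xi, yi in zip(entities, x, y):
        groups[entity].append((xi, yi))

    s_xy = 0.0
    s_xx = 0.0
    for obs in groups.values():
        # Entity-specific means used for demeaning.
        x_bar = sum(xi for xi, _ in obs) / len(obs)
        y_bar = sum(yi for _, yi in obs) / len(obs)
        for xi, yi in obs:
            s_xy += (xi - x_bar) * (yi - y_bar)
            s_xx += (xi - x_bar) ** 2
    if s_xx == 0:
        raise ValueError("x must vary within at least one entity")
    return s_xy / s_xx


# Hypothetical panel: two firms, three periods each, common slope 2,
# firm-specific intercepts 1 (firm A) and 10 (firm B), no error term.
firm = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [3.0, 5.0, 7.0, 12.0, 14.0, 16.0]
b_within = within_estimator(firm, x, y)
```

Because the data contain no error term, the within estimate equals the common slope used to generate them.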

G. A primer on causal identification

Intended Learning Outcomes (ILO)
At the end of the course, students will be able to...

·        Understand the principles and techniques of coding for data analysis and econometric modeling.

·        Use popular programming languages such as Python and Stata proficiently for data analysis.

·        Manipulate, analyze, and visualize data.

·        Write efficient and effective code for data analysis and econometric modeling.

·        Interpret and communicate findings effectively.

·        Recognize the importance of reproducibility and collaboration in data analysis and econometric modeling.

At the end of the course, students will be able to...

·        Apply coding and econometric techniques to real-world data sets.

·        Think critically and solve problems, especially in the field of data analysis and econometrics.

·        Conduct rigorous data analysis and econometric modeling using code in a variety of contexts.

Teaching methods
  • Face-to-face lectures
  • Online lectures
  • Guest speakers' talks (in class or remote)
  • Exercises (exercises, database, software etc.)
  • Group assignments

In addition to face-to-face lectures, the course will feature guest speakers’ talks, focusing in particular on the availability of, and modes of access to, free and commercial databases covering economic, financial, and social topics. Special emphasis will also be placed on practice and self-assessment: several sessions throughout the course will be devoted to practice and problem-solving. Finally, students will be required to develop a group project. The project consists of addressing an original research question, formulating hypotheses on the relevant causal relations, collecting and manipulating appropriate data, testing the hypothesized relations, and presenting the main results.

Assessment methods
  Continuous assessment | Partial exams | General exam
  • Written individual exam (traditional/online): x
  • Group assignment (report, exercise, presentation, project work etc.): x

    The assessment of learning will be based on two criteria:


    1.  Four individual in-course (in itinere) tests, two on the first part of the course and two on the second. Only the best three of the four tests count, and their average grade is worth 40% of the final grade.
    2. A group project. As mentioned in the teaching methods, students will have to develop, write and present an original paper. The final paper is worth 60% of the final grade. Both the in-class presentation and the final written paper will be evaluated.


    All members of each team must attend and actively contribute to the presentation. Each student's skill in presenting the research will be assessed individually.


    The final grade will be entirely based on an individual written exam at the end of the course.

    Teaching materials

    The first part of the course will be based on materials announced at the start of the course. For the second part, the material will include slides and selected papers communicated at the beginning of the course. The second part of the course will also rely on selected chapters from the following textbook: Hill, R. C., Griffiths, W. E., & Lim, G. C. (2018). Principles of econometrics. John Wiley & Sons.

    Last change 27/05/2023 12:38