Course, academic year 2020-2021

20592 - STATISTICS AND PROBABILITY

Department of Decision Sciences

Course taught in English
DSBA (8 credits - I sem. - OB  |  2 credits MAT/06  |  6 credits SECS-S/01)
Course Director:
SONIA PETRONE

Classes: 23 (I sem.)
Instructors:
Class 23: SONIA PETRONE


Mission & Content Summary

MISSION

A solid background in Probability and Statistics is an absolute MUST for a data scientist, in whatever field she/he intends to work. This course aims to provide such a solid methodological background. We start with a recap of fundamental notions in probability theory and stochastic processes (in particular, Markov chains), presented in a friendly but rigorous way. We then move to classical statistical inference, covering the basics of maximum likelihood estimation, confidence intervals and tests, and end with an introduction to Bayesian learning. Throughout, we keep in mind the big "explain or predict" debate; simplifying a lot, "classical statistics" versus "modern statistics" and machine learning. The course is completed by a module on computational methods (stochastic integration & Monte Carlo, optimization, bootstrap, Markov chain Monte Carlo), with Python. Classes include frontal lectures, group work with periodic assignments, coding and simulations.

CONTENT SUMMARY

 

PART I : Probability recap

- Definition and basic properties

- Random variables. Multivariate distributions

- Expectation and conditional expectation.

- Convergence of random variables.

- Basic notions of stochastic processes. Random noise. Random walks. Markov chains.

 

PART II : Statistical inference
- Models, Statistical Inference and Learning
- Elements of nonparametric estimation.

- The bootstrap.

- Parametric Inference

- MLE and asymptotics

- Confidence intervals

- Hypothesis testing and p-values

 

PART III : Bayesian learning

- Fundamentals of Bayesian learning

- Bayes rule and examples.

- Bayesian linear regression (if time permits)

 

ALL OVER: Computational methods

  • Stochastic integration and Monte Carlo.
  • Optimization. EM algorithm.
  • Parametric bootstrap (if time permits)
  • Markov Chain Monte Carlo.
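
As an illustration only (not taken from the course material), plain Monte Carlo integration, the first item in the list above, might be sketched in Python as follows. The function name `mc_integrate`, the integrand exp(-x^2) on [0, 1], the sample size and the seed are all assumptions chosen for this example.

```python
import math
import random


def mc_integrate(f, a, b, n, seed=0):
    """Estimate the integral of f over [a, b] from n uniform draws.

    Hypothetical helper for illustration: returns the Monte Carlo
    estimate together with its estimated standard error.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    values = [f(rng.uniform(a, b)) for _ in range(n)]
    mean = sum(values) / n
    # Unbiased sample variance of f(U), with U uniform on [a, b]
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    estimate = (b - a) * mean
    std_error = (b - a) * math.sqrt(var / n)
    return estimate, std_error


# Example integrand (an assumption): exp(-x^2) on [0, 1]
estimate, std_error = mc_integrate(lambda x: math.exp(-x * x), 0.0, 1.0, 100_000)
print(f"estimate = {estimate:.4f}, standard error = {std_error:.4f}")
```

The standard error shrinks at rate 1/sqrt(n), which is why the estimate is reported together with its uncertainty.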

Intended Learning Outcomes (ILO)

KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to...
  • Define, describe and explain rigorously the main notions of probability and statistical learning in the frequentist and Bayesian approach.

 

  • Identify computational strategies for fundamental complex problems.

 

  • Recognize the role of probability and statistics in "data science" and related fields.

 

APPLYING KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to...
  • Estimate and predict, and quantify uncertainty, in fundamental problems
  • Write algorithms in Python to implement computational statistics techniques, namely optimization and integration techniques.

Teaching methods

  • Face-to-face lectures
  • Exercises (exercises, database, software etc.)
  • Individual assignments
  • Group assignments

DETAILS

Students will be given periodic group or individual assignments, on the theory and on the implementation of computational methods (with Python).


Assessment methods

                                                                            Continuous assessment   Partial exams   General exam
  • Written individual exam (traditional/online)                                                                    x
  • Group assignment (report, exercise, presentation, project work etc.)    x                                       x

ATTENDING AND NOT ATTENDING STUDENTS

ASSIGNMENTS:

Students will be given periodic assignments, on the theory and on the computational methods presented in class.

The assignments (take-home) will be done in groups (up to 5 people). The assignments are very important: they encourage students to keep up with the course and to verify their understanding!

They are not formally evaluated, but **students who do not submit the assignments will have additional questions in the written exam**

 

EXAM:

The exam consists of an individual written exam (unfortunately, taken remotely), which counts for 70%, and a final project on computational methods, which counts for 30%.

 

NOTE 1: The final project is done in groups, while the written exam is individual. Therefore, if the written exam is poorly done, it may count for 100%.

 

NOTE 2: The exam structure might be slightly modified to accommodate possible difficulties due to the COVID-19 pandemic, taking into account the students' needs. In that case, students will be promptly informed through BBoard announcements and other channels.


Teaching materials


ATTENDING AND NOT ATTENDING STUDENTS

Textbook

L. Wasserman, "All of Statistics", Springer

 

Additional teaching material, lecture notes, Python code, etc. will be provided on BBoard.

Last change 03/09/2020 02:14