## Teaching materials

### 20630 Introduction to Sport Analytics

This course covers the analytics requirements of a Sports Management program. It is also an opportunity for applied work for all students interested in Data Science. All applications in the course are based on the statistical software R. The course is taught through a combination of lectures, class discussion, and group presentations. Students are required to read assignments from the texts as well as additional sources provided by the instructor. Students must attend class prepared to engage in discussions; to have, articulate, and defend a point of view; and to ask questions and provide comments based on their reading and on their own R applications.

Projects will be allocated to groups of attending students. Project reports and their presentation will be part of the evaluation for attending students.

Presentations on the use of analytics in the Sport Business:

Using Analytics for a Euroleague Basketball Team, presentation by Mario Fioretti, Assistant Coach, Olimpia Milano

Using Analytics in the European Soccer Industry, presentation by Mark Nervegna, Head of Strategy and Analytics, Raiola Global

**Prerequisites:** Students are expected to have attended a core course in statistics and to be familiar with basic calculus and linear algebra.

Teaching Assistant: Office hours will be held online via Teams. The teaching assistant will support students on both projects and exercises.

Gabriele Carta, gabriele.carta@unibocconi.it, office hours

Past Exams: 2019_1, 2019_2

Mock Exam May 2023: exam, data, R code with solutions

Exam 23rd May 2023: exam, data, R code with solutions

Dynamic Documents with R Markdown

build a report with all results and comments

An introduction to R Markdown

an illustrative R Markdown code

GitHub and GitHub Desktop

A tutorial online

Project 1: Getting sport data from the web with R

The objective of this project is to illustrate how data on sports could be efficiently retrieved from the Web (via API and/or webscraping). Students should feel free to choose their preferred field and application.

Accessing APIs from R: a tutorial, R code for the tutorial, and R code for accessing data from GitHub
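As a rough illustration of reading data hosted on GitHub, the sketch below builds a raw-content URL and reads it with base R. The repository coordinates (`some-user`, `some-repo`, `data/games.csv`) and both helper names are hypothetical placeholders, not the files used in the course.

```r
# Build the raw-content URL for a file in a public GitHub repository.
github_raw_url <- function(user, repo, branch, path) {
  paste("https://raw.githubusercontent.com", user, repo, branch, path, sep = "/")
}

# read.csv() accepts a URL as well as a local path,
# so the same helper works for remote and local files.
read_remote_csv <- function(location) {
  read.csv(location, stringsAsFactors = FALSE)
}

# Hypothetical repository coordinates: replace with a real public repository.
url <- github_raw_url("some-user", "some-repo", "main", "data/games.csv")
print(url)

# Uncomment once `url` points to an existing file (requires internet access):
# games <- read_remote_csv(url)
```

For APIs that return JSON rather than CSV, a dedicated parser is normally needed; that step is left out here to keep the sketch free of package dependencies.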

Project 2: Creating Web Applications with R Shiny

The objective of this project is to create a sport-related web application with R Shiny. An illustration based on NBA data is provided, together with projects produced in 2020. Students should feel free to choose their preferred field and application.

Slides of Andrea Maver's presentation

Online tutorials on mastering R Shiny

Learning Shiny with NBA data (by Julia Wrobel):

http://juliawrobel.com/tutorials/shiny_tutorial_nba.html, https://andreamaver.shinyapps.io/EuroleagueApp/

Programmes for NBA Shiny (short version), programmes for NBA Shiny (long version)

R Shiny example

Instructions for those who have opted for the Shiny Project in 2020 are available HERE.

Project 3: An Application of Unsupervised Machine Learning to Sport Analytics

The objective of this project is to apply unsupervised machine learning, and in particular cluster analysis, to finding groups in Sport Analytics data.
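A minimal sketch of the kind of cluster analysis this project has in mind, using k-means from base R. The player statistics are simulated and the choice of three clusters is an assumption made for illustration; real project data and the appropriate number of groups will differ.

```r
# Simulated per-game statistics for 30 players in three loose roles
# (hypothetical numbers, not real NBA data).
set.seed(1)
players <- data.frame(
  points   = c(rnorm(10, 25, 2), rnorm(10, 10, 2), rnorm(10, 12, 2)),
  rebounds = c(rnorm(10, 5, 1),  rnorm(10, 11, 1), rnorm(10, 3, 1)),
  assists  = c(rnorm(10, 4, 1),  rnorm(10, 2, 1),  rnorm(10, 9, 1))
)

# Standardise so that no single variable dominates the Euclidean distance.
z <- scale(players)

# k-means with k = 3 and several random starts for stability.
fit <- kmeans(z, centers = 3, nstart = 25)

# Cluster sizes and cluster means on the original scale.
print(table(fit$cluster))
print(aggregate(players, by = list(cluster = fit$cluster), FUN = mean))
```

Standardising first is a design choice: without it, the variable with the largest variance (here points) would drive the grouping.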

P. Zuccolotto and M. Manisera (2020) *Basketball Data Science – With Applications in R*, Chapman and Hall/CRC (Chapter 4)

Link to BasketballAnalyzeR: https://bdsports.unibs.it/basketballanalyzer/

James, Witten, Hastie and Tibshirani (2013) An Introduction to Statistical Learning: With Applications in R

LINK to the recorded Presentation of the Cluster Analysis project 2020: https://eu-lti.bbcollab.com/recording/8570729e9532435b951e9b40de8470a5

SLIDES and Rmd codes

Project 4: An Application of Supervised Machine Learning to Sport Analytics

The objective of this project is to apply supervised machine learning techniques, in particular those that address the many-predictors problem, to predict top athletes' compensation.

Students should use as a benchmark the model presented in the lectures and evaluate it against alternatives generated by modern machine learning techniques.
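One way to set up such a comparison is sketched below: an OLS benchmark against ridge regression, evaluated on a hold-out sample. The data are simulated, the penalty `lambda = 10` is an arbitrary illustrative value, and ridge is only one of several possible alternatives; the benchmark model from the lectures is not reproduced here.

```r
# Simulated data: 100 athletes, 40 candidate predictors, only 5 of which matter.
set.seed(42)
n <- 100; p <- 40
X <- matrix(rnorm(n * p), n, p)
beta <- c(rep(1, 5), rep(0, p - 5))
y <- drop(X %*% beta) + rnorm(n)

# Training sample (first 60 rows) and test sample (remaining 40 rows).
train <- 1:60
Xs <- scale(X[train, ])                          # standardise on training data
Xt <- scale(X[-train, ],
            center = attr(Xs, "scaled:center"),  # reuse training means
            scale  = attr(Xs, "scaled:scale"))   # and training std. deviations
y_mean <- mean(y[train])
yc <- y[train] - y_mean

# Ridge estimator in closed form: (X'X + lambda * I)^{-1} X'y.
ridge <- function(X, y, lambda) {
  solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y))
}
b_ols   <- ridge(Xs, yc, lambda = 0)    # lambda = 0 reproduces OLS
b_ridge <- ridge(Xs, yc, lambda = 10)   # arbitrary illustrative penalty

# Out-of-sample mean squared prediction error for a coefficient vector.
mspe <- function(b) mean((y[-train] - y_mean - drop(Xt %*% b))^2)
print(round(c(ols = mspe(b_ols), ridge = mspe(b_ridge)), 3))
```

In practice the penalty would be chosen by cross-validation rather than fixed in advance.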

A further possibility for a group undertaking this project is the construction of a data challenge related to the topic of the project using the data challenge website of Bocconi University.

James, Witten, Hastie and Tibshirani (2013) An Introduction to Statistical Learning: With Applications in R

Stock J. and M. Watson (2020) Introduction to Econometrics, 4th edition, Chapter 14

Project 5: Evaluating the Home Advantage Effect from quasi-Natural Experiments

Following the COVID shock, many games in many sports were played without spectators, often within "bubbles" in which no team enjoyed a home advantage. The objective of this project is to use sport data to construct a quasi-natural experiment for the evaluation of the Home Advantage Effect.
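The simplest version of this comparison can be sketched as a regression of the home team's point margin on a dummy for games played without spectators. The margins below are made-up numbers, and the variable names are hypothetical; a credible analysis would also need to argue that the two sets of games are otherwise comparable (team quality, schedule, travel).

```r
# Made-up home-team point margins (home score minus away score).
games <- data.frame(
  margin  = c(5, 1, 4, 2, 6, 0,    # games with spectators
              2, -1, 3, 0, 1, 1),  # games without spectators
  no_fans = rep(c(0, 1), each = 6) # 1 = played behind closed doors
)

# Intercept: average home margin with spectators.
# Slope on no_fans: change in the home margin when spectators are absent.
fit <- lm(margin ~ no_fans, data = games)
print(coef(fit))
print(summary(fit)$coefficients)
```

With a single dummy regressor the slope is just the difference between the two group means, so the regression is a convenient way to obtain a standard error for that difference.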

Stock J. and M. Watson (2020) Introduction to Econometrics, 4th edition, Chapter 13

Presentation of N. Sita (2020) thesis on Evaluating the Home Advantage in NBA

**Project 6: Measuring Competitive Balance and Its Effects**

The objective of this project is to introduce and discuss the concept of Competitive Balance in the Sport Industry. Both a discussion of the theory and applications are possible.
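One common measure of competitive balance compares the observed dispersion of win percentages with the "idealized" standard deviation that would arise if every game were a coin flip. A minimal sketch follows, with made-up win percentages and under the assumption of a league without draws in which every team plays the same number of games.

```r
# Made-up end-of-season win percentages for a 6-team league,
# each team playing G = 82 games (no draws).
win_pct <- c(0.70, 0.60, 0.55, 0.45, 0.40, 0.30)
G <- 82

# Idealized standard deviation: all teams equally strong,
# so each game is won with probability 0.5.
ideal_sd <- 0.5 / sqrt(G)

# Observed dispersion around 0.5 (the league average win percentage).
actual_sd <- sqrt(mean((win_pct - 0.5)^2))

# Ratio above 1: the league is less balanced than the coin-flip benchmark.
ratio <- actual_sd / ideal_sd
print(round(c(actual = actual_sd, ideal = ideal_sd, ratio = ratio), 3))
```

For sports with draws, such as soccer, the benchmark has to be adapted; the references below discuss how.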

PRESENTATION SLIDES

Berri D.J., M.B. Schmidt and S. Brook (2006), The Wages of Wins, Stanford University Press, Ch. 3, 4

Brandes L. and E. Franck (2007) "Who Made Who? An Empirical Analysis of Competitive Balance in European Soccer Leagues", Eastern Economic Journal

Haddock D. and L.P. Cain (2006) "Measuring Parity: Tying into the Idealized Standard Deviation", Journal of Sports Economics

Koning R.H. (2000) "Balance in Competition in Dutch Soccer", The Statistician, 49, Part 3, pp. 419-431

Szymanski S. (2001) "Income Inequality, Competitive Balance and the Attractiveness of Team Sports: Some Evidence and a Natural Experiment from English Soccer", The Economic Journal, 111, F69-F84

**Project 7: Load Management and Injury Risk**

A recent report denied the existence of a significant statistical relationship between load management and injury risk in the NBA. The objective of this project is a critical analysis of the report, which will be made available to the groups taking this choice.

https://www.espn.com/nba/story/_/id/39288379/nba-report-no-link-load-management-less-injury-risk

**Project 8: The Relevance of Popular Shareholding Contribution to Team Performance**

A recent report provided evidence on the contribution of popular shareholding to team performance in European soccer. The objective of this project is a critical analysis of the report, which will be made available together with the original data to the groups taking this choice.

**Course Content Summary**

**Section 1: Sport Analytics, an Introduction**

SLIDES

The Questions in Sport Analytics.

The Answers

Modelling Data in Sports

Theory Based Models

Supervised Machine Learning

Unsupervised Machine Learning

**References**

Berri D.J., M.B. Schmidt and S. Brook (2006), The Wages of Wins, Stanford University Press

Berri D.J. and M.B. Schmidt (2010) Stumbling on Wins: Two Economists Expose the Pitfalls on the Road to Victory in Professional Sports, FT Press

Goldsberry K. (2019) Sprawlball: A Visual Tour of the New Era of the NBA, Houghton Mifflin Harcourt

James, Witten, Hastie and Tibshirani (2013) An Introduction to Statistical Learning: With Applications in R

Shea S. (2014) Basketball Analytics: Spatial Tracking

P. Zuccolotto and M. Manisera (2020) *Basketball Data Science – With Applications in R*, Chapman and Hall/CRC

Winston W.L. (2009) Mathletics, Princeton University Press

**Section 2: An introduction to R**

SLIDES

Install R and RStudio on your computer and learn how to run them

Learn what a package is and how to install it

Understand what a view is

Define a default directory

Have some fun with R Shiny

An online introduction to R

R Code

Torfs and Brauer, "A (very) short introduction to R", solutions for the Torfs-Brauer ToDo list

**Data-Objects in R**

Data Objects in R (data types) and Data Structures In R (Vectors, Matrices, Arrays, Data Frames, Lists)

Data Handling in R

Importing and Exporting, transforming and selecting data

Getting Data from the web with R

Programming and Control Flow

if-else statements, using switch, loops, functions in R

All R code used in Singh and Allen can be downloaded at

http://www.rforresearch.com/r-in-finance-economics

**R CODES** (from Singh and Allen) : Data Objects, Data Handling, Getting Data from the web, Programming, binomial model included
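The topics listed above (data structures, if-else, `switch`, loops, functions) can be seen together in a short base R sketch. The scores and the helper `label_score` are made up for illustration and are not taken from Singh and Allen.

```r
# Data structures: vector, matrix, list, data frame.
pts   <- c(110, 98, 121)                        # numeric vector
m     <- matrix(1:6, nrow = 2)                  # 2 x 3 integer matrix
info  <- list(team = "A", wins = 3L)            # list with mixed types
games <- data.frame(team = c("A", "B", "C"), pts = pts)

# A function combining switch() and if-else.
label_score <- function(p, detail = "simple") {
  switch(detail,
    simple = if (p >= 100) "high" else "low",
    fine   = if (p >= 120) "very high" else if (p >= 100) "high" else "low",
    stop("unknown detail level: ", detail)      # default branch
  )
}

# A for loop filling a pre-allocated character vector.
labels <- character(nrow(games))
for (i in seq_len(nrow(games))) {
  labels[i] <- label_score(games$pts[i], detail = "fine")
}
games$label <- labels

# The same result without an explicit loop, usually preferred in R.
games$label2 <- vapply(games$pts, label_score, character(1), detail = "fine")
print(games)
```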

**References**

Singh A.K. and D.E. Allen (2017) R in Finance and Economics: A Beginner's Guide, World Scientific Publishing, Ch. 1, 2, 3, 4

Heiss F. (2016) Using R for Introductory Econometrics, http://urfie.net/read/mobile/index.html#p=4

Yihui Xie, Dynamic Documents with R and knitr, Chapman and Hall

**EXERCISE 1** Write R code that answers all the ToDo points in Torfs P. and C. Brauer (2014) “A (very) short introduction to R”

**EXERCISE 2** An introduction to Data Handling, SOLUTION

**Section 3: Graphical and Descriptive Analysis of Sport Statistics (NBA data)**

SLIDES

Graphical Analysis

Correlation Analysis

QQ plots and Histogram

Subsetting data and TS plots

Introduction to model building and Simulation

The NBA database: download and import in R. teamsoverall2023.csv, datafiles, programme to build database from datafiles

https://www.basketball-reference.com/leagues/NBA_2023.html, programme to update data by webscraping

**R CODES**: code1, code2. Please note that you need to create Teams_overall2023.csv to run the codes.

**EXERCISE 3:** text, code

**Section 4: The Linear Regression Model**

SLIDES 1

SLIDES 2

Models for Experimental and non-Experimental Data

Models as outcomes of reduction processes

Model Estimation: the OLS and its properties

Interpreting Regression Results: Statistical Significance and Relevance

The Effects of Model Misspecification

**AN APPLICATION, THE FOUR FACTOR MODEL** R code

**EXERCISE 4:** The Four Factor Model, NOTES, solution
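As a rough guide to the inputs of the Four Factor Model, the sketch below computes the four factors from team box-score totals, using the definitions commonly published on Basketball-Reference. The function name, argument names, and season totals are hypothetical, and the course's own R code may define the factors slightly differently.

```r
# Four factors from team box-score totals.
four_factors <- function(FG, FG3, FGA, FT, FTA, TOV, ORB, opp_DRB) {
  c(
    eFG  = (FG + 0.5 * FG3) / FGA,           # effective field-goal percentage
    TOVp = TOV / (FGA + 0.44 * FTA + TOV),   # turnovers per play
    ORBp = ORB / (ORB + opp_DRB),            # offensive rebounding percentage
    FTr  = FT / FGA                          # free throws made per field-goal attempt
  )
}

# Made-up season totals for one team.
ff <- four_factors(FG = 3400, FG3 = 1000, FGA = 7200, FT = 1500,
                   FTA = 1900, TOV = 1100, ORB = 850, opp_DRB = 2750)
print(round(ff, 3))
```

Computing the same four quantities for each team's opponents gives the regressors for a linear model of wins, which can then be estimated with `lm()`.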

**References**

Winston W.L.(2009) Mathletics, Princeton University Press, Chapter 28

**Section 5: Using Models to Weight NBA Statistics**

SLIDES 1, SLIDES 2

Weighting Statistics to measure performance

Correlation analysis

The NBA Efficiency Measure

Using a Model based on Possession

Offensive Efficiency and Defensive Efficiency

Modelling Wins

Evaluating Statistics by Simulation: Monte-Carlo and Bootstrap methods

Completing the Model

Evaluating Players' Efficiency: WINS, assists and WINS48
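To illustrate evaluation by simulation, the sketch below bootstraps the mean offensive efficiency (points per 100 possessions) of one team. The game-level points and possessions are simulated, and the number of bootstrap replications `B = 2000` is an arbitrary choice; this is not the procedure used in the course codes.

```r
# Simulated points and possessions for 20 games of one team.
set.seed(123)
poss <- round(rnorm(20, mean = 100, sd = 4))
pts  <- round(poss * rnorm(20, mean = 1.10, sd = 0.08))

# Offensive efficiency: points per 100 possessions, game by game.
off_eff <- 100 * pts / poss

# Nonparametric bootstrap: resample games with replacement
# and recompute the mean each time.
B <- 2000
boot_means <- replicate(B, mean(sample(off_eff, replace = TRUE)))

# Point estimate, bootstrap standard error, and 95% percentile interval.
print(c(estimate = mean(off_eff),
        boot_se  = sd(boot_means),
        quantile(boot_means, c(0.025, 0.975))))
```

A Monte Carlo exercise differs in that the data are drawn from an assumed model rather than resampled from the observed games.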

**R CODES**: team_stat, players_stat, data on players, NOTES

**EXERCISE 5:** text, SOLUTION, SOLUTION AS RMD

**EXERCISE 6:** text, notes, SOLUTION

**References**

Berri D.J., M.B. Schmidt and S. Brook (2006), The Wages of Wins, Stanford University Press, Ch. 6, 7