Course, academic year 2023-2024

20600 - DEEP LEARNING FOR COMPUTER VISION

Department of Computing Sciences

Course taught in English
DSBA (6 credits - I sem. - OP  |  4 credits SECS-P/08  |  2 credits SECS-P/10)
Course Director:
GIACOMO BORACCHI

Classes: 31 (I sem.)
Instructors:
Class 31: GIACOMO BORACCHI


Suggested background knowledge

Students should be familiar with: Linear algebra; Rudiments of probability and statistics; Basics of machine learning and model fitting (overfitting and underfitting concepts); Neural networks (multi-layer perceptron and backpropagation); Python programming.

PREREQUISITES

Linear algebra, rudiments of probability and statistics. Basics of machine learning and model fitting (overfitting and underfitting concepts), neural networks (multi-layer perceptron and backpropagation). Good knowledge of Python.

Mission & Content Summary

MISSION

In recent years, deep neural networks have demonstrated outstanding performance in solving many complex tasks. Specifically, in Computer Vision applications such as Image Recognition, Object Recognition, Image Segmentation and Image Generation, deep learning approaches typically outperform traditional hand-crafted algorithms. This course offers a broad overview of the use of deep learning in Computer Vision, building a solid understanding of convolutional neural networks (CNNs) for image classification and progressing to advanced models that solve more sophisticated visual recognition tasks (image segmentation, object detection and image generation). The course's major goal is to provide students with the theoretical background and the practical skills to understand and use Convolutional Neural Networks for solving visual recognition problems.

CONTENT SUMMARY

Convolutional neural networks are mature, flexible, and powerful non-linear data-driven models that have been successfully applied to solve complex tasks in science and engineering. The advent of the deep learning paradigm, i.e., the use of neural networks to simultaneously learn an optimal data representation and the classification model, has furthered the data-driven paradigm. These topics will be covered in the course according to the following detailed program:

  • Basics of digital images and the image formation process

  • Basics of image transformations and image filtering (correlation and convolution) 

  • The Image Classification Problem and image classification by hand-crafted features 

  • Convolutional Neural Networks for Image Classification 

  • Famous CNN architectures

  • CNN training with data scarcity: transfer learning and data augmentation 

  • CNN Visualization and CNN Explanations

  • Fully Convolutional CNN and CNN for Image Segmentation 

  • Object Detection Networks

  • Unsupervised Models, Autoencoders 

  • Generative Adversarial Networks for Image Generation 
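
As an illustration of the image filtering topic in the program above (correlation and convolution), the following is a minimal sketch in plain Python, not course material. The function names `correlate2d` and `convolve2d`, the "valid" border handling, and the toy image and kernel are assumptions chosen for this example. It shows that convolution is correlation with the kernel flipped along both axes.

```python
def correlate2d(image, kernel):
    """'Valid' 2D cross-correlation: slide the kernel over the image as-is."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def convolve2d(image, kernel):
    """2D convolution: correlation with the kernel flipped on both axes."""
    flipped = [row[::-1] for row in kernel[::-1]]
    return correlate2d(image, flipped)


# Toy 3x3 "image" and a 2x2 diagonal-difference kernel (hypothetical values).
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 0],
    [0, -1],
]

print(correlate2d(image, kernel))  # [[-4, -4], [-4, -4]]
print(convolve2d(image, kernel))   # [[4, 4], [4, 4]]
```

The two results differ only in sign here because flipping this particular kernel negates it. For a kernel that is symmetric under the flip, the two operations coincide.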


Intended Learning Outcomes (ILO)

KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to:
  • Identify the right CNN architecture to solve different visual recognition problems 
  • Recognize the best practices, leveraging the most popular techniques such as dropout and data augmentation 
  • Describe and get inspiration from the most successful Deep Learning architectures 
  • Explain the most successful Computer Vision applications solved by Deep Learning models 
  • Illustrate complex techniques beyond the fundamental ones presented during lectures 

 

APPLYING KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to:
  • Analyze a specific Computer Vision problem and find which model best solves the task at hand 
  • Use fundamental deep learning algorithms for Computer Vision autonomously 
  • Compare the various models and find the most relevant one to apply to the specific problem 
  • Examine the selected model in order to balance performance, computational complexity and overfitting 
  • Discuss the pros and cons of different Computer Vision techniques for a specific problem 
  • Develop new pipelines adapted to the specific problem at hand 


Teaching methods

  • Face-to-face lectures
  • Exercises (exercises, database, software etc.)
  • Individual assignments
  • Group assignments

DETAILS

The course follows an interactive and hands-on teaching modality with a strong emphasis on practical aspects. In addition to the laboratory sessions, customarily held after most lectures, the course leverages project-based learning to enable students to apply the principles covered during lectures to real-world computer vision tasks. 
During practical sessions, carefully selected code samples cover the key components of image analysis and of convolutional neural networks for image classification, segmentation, object recognition, and image generation. Students are encouraged to follow along and experiment with the code to gain a solid grasp of the underpinning concepts. 
Projects are assigned to groups to foster a deeper understanding of the subject. The students are divided into teams and will face a two-phase project. The first phase, which will take place during the first half of the course, is meant to teach the students how to use CNN models for solving a basic visual recognition task. In the second phase, students are invited to choose a specific computer vision problem to be solved by advanced deep learning models. The projects must differ across teams and be challenging and relevant to current real-world applications. 
During the project development, students are expected to take advantage of the methods and skills presented during lectures to solve their specific task. At the end of the course, each team presents its project to the entire class. This presentation fosters a collaborative learning environment where teams can learn from each other's successes and challenges. 


Assessment methods

  Continuous assessment Partial exams General exam
  • Written individual exam (traditional/online)
    x
  • Group assignment (report, exercise, presentation, project work etc.)
    x

ATTENDING STUDENTS

Students’ assessment is based on two main components: 

1. Project (55% of the final grade), aimed at assessing the students’ proficiency in: 

- identifying the right CNN architecture to solve a visual recognition problem 

- recognizing the best practices, leveraging the most popular techniques such as dropout and data augmentation 

- finding which model best solves the task at hand 

- using fundamental deep learning algorithms for Computer Vision autonomously 

- developing new pipelines adapted to the specific problem at hand. 

The evaluation of the project will be based on a written report presenting the methodology adopted and the outcomes attained. 

 

2. Written exam (45% of the final grade), consisting of closed questions aimed at assessing students’ ability to: 

- explain the most successful Computer Vision applications solved by Deep Learning models 

- illustrate complex techniques 

- apply the analytical tools illustrated during the course. 


NON-ATTENDING STUDENTS

Students’ assessment will be based on a written exam aimed at assessing their expertise. 


Teaching materials


ATTENDING STUDENTS

Slides and links to reference papers will be distributed. Colab notebooks will also be provided.


NON-ATTENDING STUDENTS

Slides and links to reference papers will be distributed. Colab notebooks will also be provided. 

Last change 28/07/2023 10:51