Course, academic year 2025-2026

20597 - NATURAL LANGUAGE PROCESSING

Department of Computing Sciences



Course taught in English
DSBA (6 credits - II sem. - OB  |  ING-INF/05)
Course Director:
DEBORA NOZZA

Classes: 23 (II sem.)
Instructors:
Class 23: DEBORA NOZZA


Suggested background knowledge

To feel comfortable in this course, you should have a good knowledge of programming in Python, as well as basic linear algebra (what vectors and matrices are, and how they are multiplied) and probability theory (what a probability distribution is, and what conditional probability is). Additional knowledge of data structures makes many of the applications easier to solve.
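As a rough self-check of the background described above, the following minimal sketch multiplies a matrix by a vector and computes a conditional probability in plain Python. It is an illustration only and is not taken from the course materials; the function names and the toy counts are hypothetical.

```python
from fractions import Fraction


def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector (a list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def conditional_probability(p_a_and_b, p_b):
    """Return P(A | B) = P(A and B) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b


# Matrix-vector product: each output entry is the dot product of a row with the vector.
A = [[1, 2], [3, 4]]
v = [5, 6]
product = matvec(A, v)

# Made-up counts: out of 100 documents, 40 contain the word "great",
# and 30 both contain "great" and are labeled positive.
p_great = Fraction(40, 100)
p_great_and_positive = Fraction(30, 100)
p_positive_given_great = conditional_probability(p_great_and_positive, p_great)

print(product)                  # [17, 39]
print(p_positive_given_great)   # 3/4
```

If both results are easy to verify by hand, the linear algebra and probability used in the course should feel familiar.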

Mission & Content Summary

MISSION

Natural Language Processing tools are becoming ubiquitous: from everyday large language model applications like ChatGPT or Siri to high-stakes decision-making systems in industry or politics, and to text-driven research methods in the social sciences. With the rapid rise of LLMs and machine-learning–based text analysis, the field now offers powerful new possibilities and has become a key area of expertise. Whether the goal is to use foundation models to explore text and uncover structures and topics, to adapt LLMs for tasks such as sentiment classification, or to understand how these models shape and transform human–AI interaction, this course provides an overview of the relevant techniques and hands-on experience with them.

CONTENT SUMMARY

Preparation: how do I work with text

  • Data formats

  • Preprocessing

  • Storage and retrieval

Exploration: exploring structure in the data

  • Clustering

  • Topic models

  • Static and contextual embeddings

Prediction: finding patterns to impute new values

  • Text classification (sentiment, author attributes)

  • Logistic Regression

  • Perceptron

  • Feed-forward Neural Nets

  • Transformer architecture

  • BERT

Large Language Models

  • ChatGPT and generative models

  • Prompting and in-context learning

  • Fine-tuning and adaptation 

  • Retrieval-Augmented Generation (RAG)


Intended Learning Outcomes (ILO)

KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to...
  • Describe different text analysis problems.
  • Talk about the linguistic foundations.
  • Distinguish between exploration and prediction approaches.
  • Know which algorithm to choose for a given problem.
  • Understand the trade-offs between different approaches.

APPLYING KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to...
  • Apply their knowledge to practical text analysis problems.
  • Implement a variety of algorithms for text exploration and classification in Python.
  • Work effectively with Large Language Models, including prompting, evaluation, and integration into text analysis workflows.

Teaching methods

  • Lectures
  • Guest speaker talks (in class or remotely)
  • Practical Exercises
  • Individual work / assignments
  • Collaborative work / assignments

DETAILS

  • Most lectures feature hands-on exercises in Jupyter notebooks.
  • Each student completes several individual assignments to get experience in implementation details.
  • Students work together in groups to solve a joint task.
  • If available, guest speakers from data-science companies present their work on text and language processing.

Assessment methods

  Assessment type                                                                      Continuous assessment   Partial exams   General exam
  • Individual work / assignment (report, exercise, presentation, project work, etc.)  x
  • Active class participation (virtual, attendance)                                   x

ATTENDING AND NON-ATTENDING STUDENTS

There is no distinction between attending and non-attending students.


Teaching materials


ATTENDING AND NON-ATTENDING STUDENTS

  • Lecture slides and notes provided on Bboard.
  • D. HOVY, Text Analysis in Python for Social Scientists: Discovery and Exploration, Cambridge University Press, 2020. (https://www.cambridge.org/core/elements/text-analysis-in-python-for-social-scientists/BFAB0A3604C7E29F6198EA2F7941DFF3)
  • D. JURAFSKY, J.H. MARTIN, Speech and language processing, Vol. 3, London, Pearson, 2014.
  • C.D. MANNING, H. SCHÜTZE, Foundations of statistical natural language processing, MIT Press, 1999.
  • S. MARSLAND, Machine learning: an algorithmic perspective, CRC Press, 2015.
  • F. CHOLLET, Deep learning with Python, Manning Publications Co., 2017.
Last change 28/11/2025 08:24