Course, academic year 2018-2019

20597 - NATURAL LANGUAGE PROCESSING

Department of Marketing

Course taught in English

DSBA (6 credits - II sem. - OBCUR | ING-INF/05)

Course Director:
DIRK HOVY

Classes: 31 (II sem.)

Instructors:
Class 31: DIRK HOVY

Prerequisites

To feel comfortable in this course, you should have a good knowledge of basic linear algebra and probability theory, as well as programming in Python. Additional knowledge of data structures will make many of the assignments easier to solve.

Mission & Content Summary

MISSION

Natural Language Processing tools are becoming ubiquitous: from everyday tools like Siri or Alexa, to decision-making processes in industry or politics, to text analysis tools in social science research. Machine-learning-based text analysis tools provide a range of possibilities and are a growing field of expertise. Whether it is the exploration of text to find structures and topics, or the construction of a classifier to predict the sentiment or author characteristics of a text, this course provides an overview of, and hands-on experience with, the relevant techniques.

CONTENT SUMMARY

Preparation: how do I work with text:

Data formats.
Preprocessing.
Storage and retrieval.

Exploration: exploring structure in the data:

Clustering.
Topic models.
Word embeddings.

Prediction: finding patterns to impute new values:

Text classification (sentiment analysis, author attributes).
Logistic Regression.
Perceptron.
Feed-forward Neural Nets.
Convolutional Neural Nets.
Structured prediction (parts of speech, parsing, discourse).
Structured perceptron.
Recurrent Neural Nets.

Intended Learning Outcomes (ILO)

KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to...

Describe different text analysis problems.
Talk about the linguistic foundations.
Distinguish between exploration and prediction approaches.
Know which algorithm to choose for a given problem.
Understand the trade-offs between different approaches.

APPLYING KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to...

Apply their knowledge to a practical problem.
Implement a variety of algorithms for text exploration and classification in Python.

Teaching methods

Face-to-face lectures
Guest speakers' talks (in class or remote)
Exercises (exercises, database, software etc.)
Individual assignments
Group assignments
Participation in external competitions

DETAILS

Guest speakers from various data-science companies will present their work on text and language processing.
Each lecture features hands-on exercises in Jupyter notebooks.
Each student completes several individual assignments to get experience in implementation details.
Students work together in groups to solve a joint task.
If applicable/available, students have the option to participate in external competitions such as Kaggle competitions or shared tasks in natural language processing.

Assessment methods

	Continuous assessment	Partial exams	General exam
Individual assignment (report, exercise, presentation, project work etc.)	x
Group assignment (report, exercise, presentation, project work etc.)	x

ATTENDING AND NON-ATTENDING STUDENTS

Individual assignment (50%).
Final group project (50%): graded on the performance of the system and the quality of the report.

Teaching materials

ATTENDING AND NON-ATTENDING STUDENTS

D. JURAFSKY, J.H. MARTIN, Speech and language processing, Vol. 3, London, Pearson, 2014.
C.D. MANNING, H. SCHÜTZE, Foundations of statistical natural language processing, MIT Press, 1999.
S. MARSLAND, Machine learning: an algorithmic perspective, CRC Press, 2015.
F. CHOLLET, Deep learning with Python, Manning Publications Co., 2017.

Last change 03/07/2018 12:50